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Part  I  of  this  report  describes  the  analysis  on  moving  averages,  variances  and  variograms  for  NIR 
from  a  SPOT  image  of  part  of  Fort  A.  P.  Hill.  The  relation  between  elevation  from  a  digital  elevation 
model,  the  raw  elevation  data  and  NIR  and  NDVI  were  explored  in  detail.  The  correlations  are  weak 
in  spite  of  an  apparent  visual  relation.  Part  II  describes  the  analyses  of  the  hyperspectral  ‘hymap’ 
data.  The  results  illustrate  the  difficulty  of  deciding  how  many  and  which  wavebands-  to  retain,  j 
Certain  groups  of  bands  reappear  in  different  analyses,  but  equally  there  are  less  stable  groupings,  j 
The  PCA  results  do  not  discriminate  as  well  as  the  raw  variograms  and  certain  classifications.  Eight 
groups  of  bands  seem  to  appear  regularly.  The  pixel  maps  show  that  even  within  these  groups 
different  information  emerges.  Part  IV  of  the  report  describes  the  analysis  of  the  National  Soil 
Inventory  of  England  and  Wales.  Geostatistical  methods  and  wavelet  analysis  are  compared.  Part 
IV  of  the  report  of  the  project  is  an  aide  memoire  for  sampling.  It  embraces  both  design-based 
sampling,  which  is  based  on  classical  statistics,  and  model-based  sampling  which  is  underpinned 
by  geostatistics.  This  work  is  a  guide  to  sampling  field-based  information  or  pixels  from  images. 
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EXECUTIVE  SUMMARY 


This  report  contains  the  three  interim  reports,  together  with  the  most  recent  results.  In  addition, 
there  is  an  Appendix.  Part  I  describes  the  analysis  on  moving  averages,  variances  and  variograms  for  NIR 
from  a  SPOT  image  of  part  of  Fort  A.  P.  Hill.  The  computer  programs  for  these  have  already  been  provided 
to  the  Topographic  Engineering  Center.  The  maps  of  the  variances  and  the  diagram  of  the  variograms  show 
areas  of  the  image  that  are  not  stationary  and  where  kriging  is  likely  to  perform  less  well  in  prediction  than 
wavelet  analysis.  This  proves  the  findings  of  Project  PR-N  68171-98-M-531 1.  This  section  also  includes  the 
analysis  of  a  part  of  the  image  called  ‘aphillcut’  to  distinguish  it  from  the  larger  section  of  image  used  that  ahs 
been  used.  This  explored  in  detail  any  relation  between  elevation  from  a  digital  elevation  model,  the  raw 
elevation  data  and  NIR  and  NDVI.  The  correlations  are  weak  in  spite  of  an  apparent  visual  relation.  The 
strongest  relation  was  between  the  moving  averages  for  NDVI  and  elevation  for  a  window  of  10  x  10  pixels. 

Part  II  describes  the  analyses  of  the  hyperspectral  ‘hymap’  data.  This  includes  a  correlation  analysis,  principal 
component  analysis  (PCA),  variography,  mapping,  multivariate  non-hierarchical  classification  and  factorial 
kriging.  The  results  illustrate  the  difficulty  of  deciding  how  many  and  which  wavebands  to  retain.  Certain 
groups  reappear  in  different  analyses,  but  equally  there  are  less  stable  groupings.  The  PCA  results  are  not 
particularly  discriminatory,  whereas  those  for  the  raw  variograms  and  certain  classifications  appear  to  identify 
about  eight  groups  of  bands.  The  pixel  maps  show  that  even  within  these  groups  different  information 
emerges.  Further  interpretation  of  some  of  these  results  requires  input  from  personnel  at  TEC. 

Part  IV  of  the  report  describes  the  analysis  of  the  National  Soil  Inventory  of  England  and  Wales.  The  aim  was 
to  compare  geostatistical  methods,  mainly  ordinary  kriging  and  factorial  kriging,  with  wavelet  analysis  on  a 
different  kind  of  data  from  imagery.  The  data  were  from  sampling  locations  on  a  5-km  grid  and  pH  and  zinc 
were  selected  for  analysis.  The  variogram  of  pH  showed  that  there  was  long-range  trend  in  the  data,  which 
had  to  be  removed  for  the  geostatistical  analysis  whereas  the  wavelet  analysis  is  not  affected  by  it.  Zinc  was 
markedly  skewed  and  the  data  were  transformed  to  common  logarithms  for  the  geostatistical  analysis,  which 
again  was  not  necessary  for  the  wavelet  analysis.  The  results  have  shown  some  interesting  features.  There 
appears  to  be  no  local  non-stationarity  in  these  data,  which  meant  that  kriging  performed  better  than  the 
wavelet  analysis  in  terms  of  the  distribution  of  the  errors  for  the  10-km  subsample.  However,  for  the  40-km 
subsample  the  wavelet  analysis  performed  better.  The  variograms  for  both  properties  were  nested  and  the 
short-range  variation  was  evident  in  the  high  frequency  wavelet  transform  for  the  20-km  grid.  The  variogram 
can  provide  a  guide  as  to  what  sampling  interval  should  be  focused  on  in  a  multiresolution  analysis  using 
wavelets. 

Part  IV  of  the  report  of  the  project  is  a  stand-alone  piece  of  work,  mainly  prepared  by  Professor  Richard 
Webster.  This  is  an  aide  memoire  for  sampling.  It  embraces  both  design-based  sampling,  which  is  based  on 
classical  statistics,  and  model-based  sampling  which  is  underpinned  by  geostatistics.  This  work  is  a  guide  to 
sampling  field-based  information  or 

pixels  from  images.  It  starts  with  what  the  user  should  consider  before  sampling,  i.e.  the  target  population,  the 
sample  support  (volume  of  sample),  the  individuals,  what  the  data  are  to  be  used  for,  what  kind  of  predictions 
are  required.  Based  on  the  kind  of  predictions  required  the  user  will  decide  either  to  have  a  sample  design  for 
design-based  estimation  or  one  for  model-based  prediction. 
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PART  I:  MOVING  AVERAGES,  VARIANCES  AND  TILED  VARIOGRAMS  FOR 
FORT  A.  P.  HILL 


This  first  section  of  the  report  contains  several  sections  of  work.  The  analysis  and  computer 
programs  for  the  first  part  of  the  report  on  the  moving  averages,  variances  and  variograms  were 
completed  in  the  previous  project.  There  was  insufficient  time,  however,  to  complete  the  written 
part  of  this  work.  The  complete  section  is  now  included  although  files  of  the  results  and  computer 
programs  had  been  sent  to  TEC  in  digital  form.  Four  days  of  the  present  contract  were  spent 
working  at  the  Topographic  Engineering  Center  and  one  day  at  the  Virginia  Institute  of  Marine 
Science.  In  addition  time  has  been  spent  on  the  wavelets  and  kriging  paper  for  submission  to  the 
Proceedings  of  the  Sixth  International  Geostatistics  Congress.  This  paper  can  only  be  accessed  on 
CD-Rom  at  present,  the  published  proceedings  will  come  out  in  2001.  The  rest  of  this  report  will 
comprise  work  done  on  the  image  and  digital  elevation  model  for  Fort  A.  P.  Hill,  and  on  the  latest 
hyperspectral  imagery. 


Moving  averages,  variances  and  tiled  variograms 

The  computation  of  a  variogram  for  a  region  carries  with  it  the  implication  that  the  underlying 
variation  is  stationary  in  the  intrinsic  sense.  The  analyst  assumes  that  the  expected  differences 
between  places,  at  least  for  small  lag  distances,  are  zero  and  that  the  expected  squared  differences 
are  constant  for  any  given  lag: 

E[Z(x)-Z(x  +  h)]  =  0  (1) 

and  E[{Z(x)  -  Z(x  +  h)}2  ]  =  2^(h).  (2) 


Here,  in  the  usual  geostatistical  convention,  Z(x)  and  Z(x+h)  are  the  values  of  the  random  variable 
Zat  positions  x  and  x+h,  where  h  is  the  lag,  and  y(h)  is  the  semivariance  at  that  lag. 

The  equations  above  refer  to  the  random  process  Z(x),  and  in  any  one  realization  the  actual  values 
depart  from  its  expectation.  Geostatisticians  are  used  to  this  and  accept  Equation  (1)  without  demur 
in  many  instances.  Equation  (2)  is  assumed  to  hold  everywhere;  i.e.  the  semivariance  depends  on 
the  lag  only  and  not  on  the  position.  Geostatisticians  are  used  to  this  too;  usually  they  have  too  few 
data  to  explore  its  validity.  Remote  images  from  satellites,  each  comprising  several  hundred 
thousand  pixels,  however,  may  cause  us  to  question  the  assumption  when  some  parts  of  an  image 
are  plainly  more  variable  than  others.  The  SPOT  image  of  AP  Hill  is  one  such;  Figure  1,  a  pixel 
map  of  NIR,  is  an  example.  This  is  the  part  of  A.  P.  Hill  that  we  examined  originally.  With  so  many 
data  we  can  examine  the  changes  in  the  variogram  over  the  region,  and  the  result  can  help  us  to 
decide  whether  our  assumption  of  stationary  is  reasonable. 
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Figure  1.  Pixel  map  of  NIR  for  the  original  part  of  the  SPOT  image  for  Fort  A.  P.  Hill 
The  coordinates  are  given  as  UTMs. 


Analysis 

We  computed  three  sets  of  criteria  to  evaluate  the  likely  stationarity;  the  moving  average,  the 
moving  variance,  and  the  local  variogram.  The  first  two  are  defined  as  follows. 

Moving  average .  This  is  computed  by  placing  over  the  image  a  window  containing  n  pixels  and 
summing  the  values  of  NIR  within  the  window: 

^(x,)=“Zz(x/)-  (3) 

n  /=1 

where  z(x,)  is  the  value  of  the  ith  pixel  in  the  window,  and  xc  denotes  the  centre  of  the  pixel.  The 
window  is  moved  one  pixel  at  a  time,  and  at  each  new  position  jf(xc  )  is  computed.  In  this  way  a 
map  of  moving  averages  is  built. 

Moving  variance.  The  moving  variance,  (c?(xc),  is  computed  in  an  analogous  way  from 

^(xc)  =  _7lWx/)-^)}2-  (4) 

n  —  i  /=] 
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Again,  a  map  of  the  local  variances  can  be  made. 


Local  variogram.  Semivariances  can  be  calculated  for  a  window  from 


1  "7(h) 

y(h')=^£Wx')_z(I'+l,)}2-  (5) 

where  the  index  j  refers  to  those  pixels  for  which  a  paired  comparison  is  possible  at  the  particular 
lag  h,  and  m(h)  is  the  number  of  such  comparisons. 

Computing  local  variograms  for  the  same  windows  as  for  the  average  and  variance  runs  into  some 
difficulties. 

1.  Experimental  variograms  are  unreliable  with  fewer  than  about  100  data.  So  it  has  seemed 
unreasonable  to  attempt  to  compute  them  for  small  windows,  i.e.  less  than  10  x  10. 

2.  They  are  time-consuming,  though  not  so  time-consuming  as  to  make  the  task  impossible  on  a 
modern  workstation. 

3.  Displaying  the  hundreds  of  thousands  of  variograms  is  fraught. 

So,  instead  of  moving  the  window  one  pixel  at  a  time  we  divided  the  image  into  square  tiles 
without  any  overlap.  Also,  to  provide  an  intelligible  display  we  chose  tiles  of  15  x  15  =  225  pixels. 
From  experience  we  know  that  this  size  of  sample  estimates  the  local  variogram  well.  We  then 
computed  the  variogram  to  a  maximum  lag  of  12.5  pixels. 

Results 

We  computed  the  moving  averages  and  moving  variances  for  square  windows  of  sides  3,  5,  7,  9, 
11,  and  15  pixels.  The  results  are  displayed  in  Figures  2a  to  7a.  In  Figure  2a  (3  x  3  window)  the 
features  in  the  original  image.  Figure  1  are  still  visible.  For  example  the  lakes  in  the  north  and 
centre  (blue  areas)  and  the  line  of  the  road  running  N-S  (green).  The  latter  is  not  as  evident  as  in 
Figure  1.  In  Figure  3a  (5  x  5  window)  the  lakes  are  still  evident,  but  the  outline  is  no  longer  distinct 
and  the  line  of  the  road  is  barely  visible.  As  the  window  widens  further,  Figures  4a  to  7a,  the  coarse 
features  of  the  image  become  increasingly  apparent.  The  map  for  the  window  of  9  x  9  pixels  is  very 
similar  to  that  for  the  long-range  structure  from  factorial  kriging  (see  previous  final  report).  There 
is  little  evidence  of  non-stationarity. 

The  margins  of  the  maps,  which  broaden  as  the  window  increases  in  size,  are  the  global  average. 
This  value  is  given  when  the  window  dimensions  cannot  be  fitted  due  to  edge  effects. 

Analogous  maps  of  the  moving  variances  are  shown  in  Figures  2b  to  7b.  These  tell  a  different 
story. 
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The  smallest  window,  3x3=9  pixels,  shows  local,  mostly  sinuous,  patches  of  image  where  the 
variance  is  much  larger  than  the  general  background.  Figure  2b.  In  particular  the  variances  are  large 
at  the  margins  of  the  lakes,  and  their  shape  is  evident.  Increasing  the  window  to  25  pixels  leaves  a 
basically  sinuous  pattern,  but  with  the  patches  of  large  variance  beginning  to  broaden.  Figure  3b. 
This  broadening  continues  as  the  window  is  enlarged,  and  the  sinuosity  has  almost  disappeared  and 
is  confined  to  patches  of  medium  variation  with  the  window  of  81  pixels.  Figure  5b.  The  patches  of 
large  variance  appear  as  blobs,  as  they  do  with  the  wider  windows.  As  for  the  maps  of  the  moving 
averages,  the  uniform  bands  at  the  margins  are  of  the  global  variance. 

Figure  8  shows  the  variograms  in  their  corresponding  tiles,  and  so  is  effectively  a  map.  In  each  tile 
the  experimental  variogram  appears  as  the  set  of  points,  the  left-most  of  which  is  at  the  origin  and 
the  right-most  at  a  lag  of  12.5  pixels  is  scaled  so  that  it  lies  near  the  right  hand  margin  of  the  tile. 
The  ordinates  are  scaled  in  the  range  0  to  20  000  NIR2.  The  horizontal  line  through  the  points  is  the 
local  variance  of  the  data.  The  full  image  is  151  pixels  (columns)  178  (rows).  The  width  of  the  tiles, 
1 5  pixels,  will  not  divide  exactly  into  either,  and  so  there  are  gaps  at  the  right  and  top 
of  the  image. 

Over  most  of  the  image  the  local  variance  is  small,  and  it  is  difficult  to  see  what  autocorrelation 
there  is.  In  a  relatively  few  squares,  however,  the  variance  is  evidently  strongly  structured.  These 
squares  also  happen  to  be  the  ones  with  the  largest  variances,  and  perhaps  it  is  because  they  have 
the  large  variances  that  it  is  possible  to  display  the  results  in  an  informative  way. 
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a)  Moving  averages  for  block  side  3 
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b)  Moving  variances  for  block  side  3 
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Figure  2.  Pixel  maps  of  NIR  (Fort  A.  P.  Hill):  a)  moving  averages  and  b)  moving  variances 
for  blocks  of  side  3. 


a)  Moving  averages  for  block  side  7 
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b)  Moving  variances  for  block  side  7 
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Figure  4.  Pixel  maps  of  NIR  (Fort  A.  P.  Hill):  a)  moving  averages  and  b)  moving  variances 
for  blocks  of  side  7. 


a)  Moving  averages  for  block  side  9 


b)  Moving  variances  for  block  side  9 
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Figure  5.  Pixel  maps  of  NIR  (Fort  A.  P.  Hill):  a)  moving  averages  and  b)  moving  variances 
for  blocks  of  side  9. 


a)  Moving  averages  for  block  side  11 
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b)  Moving  variances  for  block  side  11 
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Figure  6.  Pixel  maps  of  NIR  (Fort  A.  P.  Hill):  a)  moving  averages  and  b)  moving  variances, 
for  blocks  of  side  1 1 . 
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a)  Moving  averages  for  block  side  15 
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b)  Moving  variances  for  block  side  15 
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Figure  7.  Pixel  maps  of  NIR  (Fort  A.  P.  Hill):  a)  moving  averages  and  b)  moving  variances, 
for  blocks  of  side  15 
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Figure  8.  Mosaic  of  variograms  that  correspond  to  the  square  windows  of  15  x  15  pixels. 
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Conclusions 

The  maps  of  moving  average  show  increasingly  clearly  the  coarse  features  of  the  image  as  the 
moving  window  is  widened.  There  is  little  evidence  of  trend  and  none  sufficient  to  suggest  that  the 
generating  process  is  non-stationary.  The  maps  of  moving  variance,  however,  reveal  small  patches 
where  the  variation  is  much  larger  than  over  most  of  the  image.  They  suggest  that  the  process  is  not 
stationary  locally  in  the  variance  and  that  the  intrinsic  hypothesis  should  not  be  assumed. 

The  steeply  sloping  local  variograms  confirm  this  view.  It  was  precisely  in  these  areas  of  local  non- 
stationarity  that  the  wavelet  analysis  performed  better  than  kriging  in  the  data  reconstruction 
(Oliver  et  al.  2000).  One  thing  that  we  shall  try  later  in  this  project  is  to  use  a  moving  variogram 
for  kriging  to  compare  with  the  wavelet  results. 


REPORT  ON  VISIT  BY  DR  OLIVER  TO  TEC  JUNE  14-21  2000 

Part  of  the  first  morning  was  spent  with  Mr  J.  Shine  and  Mr  E.  Bosch  discussing  how  the  time 

should  be  spent  during  the  visit.  Several  things  were  achieved  at  TEC  during  the  four  and  a  half 

days  that  I  was  there. 

1)  Mr  Shine  and  I  started  work  again  on  a  short  paper  for  the  International  Journal  of  Remote 
Sensing.  This  is  now  almost  complete  and  needs  the  final  conclusions  to  be  added.  The  figures 
are  now  complete;  the  final  variogram  was  fitted  by  a  triple  spherical  function.  This  paper  will 
be  ready  to  be  sent  to  the  Journal  in  the  near  future. 

2)  In  addition  I  went  through  various  Genstat  routines  with  Mr  Shine.  In  trying  to  krige  part  of  an 
image  the  program  gave  us  an  error  message  about  the  grid  size.  Since  my  return  to  England  I 
have  contacted  Rothamsted  Experimrntal  Station  where  Genstat  is  developed  and  I  was  told 
that  this  has  been  resolved  in  the  new  version  of  the  product  which  TEC  is  shortly  to  receive 
when  the  license  is  renewed. 

3)  In  discussing  the  results  of  comparing  the  digital  elevation  model  with  the  image  data  I 
discovered  that  the  original  data  for  elevation  were  available  without  having  been  smoothed  by 
some  form  of  interpolation.  We  extracted  those  using  Imagine  with  some  help  for  the  area 
where  we  have  been  working.  These  data  have  now  been  re-analysed,  but  there  is  little 
improvement  in  the  correlations. 

4)  Jim  Shine  and  I  were  fortunate  enough  to  have  a  meeting  with  Colonel  Jack  Marin  from  West 
Point.  He  showed  us  some  ways  of  dealing  with  hyperspectral  imagery  using  neural  networks. 
We  have  started  to  analyse  the  ‘hymap’  data  and  we  hope  that  the  variogram  will  identify  bands 
that  describe  different  ground  textures. 

5)  Mr  E.  Bosch  and  I  worked  on  the  soil  data  from  England  and  Wales.  He  has  done  a  wavelet 
analysis  of  a  selected  part  of  the  data  in  the  central  part  of  the  country.  I  did  the  factorial  kriging 
of  the  same  area  to  compare  the  wavelet  analysis  for  the  multispectral  analysis.  We  shall 
continue  with  this  work  when  Mr  Bosch  visits  England  in  August. 

6)  I  spent  one  day  at  the  Virginia  Institute  of  Marine  Science.  Kevin  Slocum  is  registered  as  a  PhD 
student  there  and  I  am  one  of  his  PhD  committee  members.  His  oral  examination  took  place  on 
June  16th.  It  was  successful  and  from  my  perspective  interesting.  In  the  afternoon  I  gave  a 
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lecture  which  included  some  general  geostatistics,  but  mainly  focused  on  the  work  that  I  have 
been  involved  in  at  TEC.  I  presented  the  factorial  kriging  analysis  of  Fort  A.P.Hill  and  also  the 
latest  work  on  wavelets.  The  ensuing  discussion  was  lively. 

One  outcome  of  this  visit  was  a  request  to  teach  a  short  geostatistics  course  there. 


FORT  A.  P.  HILL:  APHILLCUT 

In  the  previous  final  report  we  examined  the  digital  elevation  model  for  5-m  and  20-m  resolutions. 
Although  there  appears  to  be  a  relation  between  NIR  and  elevation  visually  this  was  not  proved  by 
the  correlation  coefficients  that  we  computed,  which  were  small.  Mr  Shine  and  Dr  Oliver  selected  a 
section  of  the  image  that  we  had  been  working  on  that  minimized  the  areas  of  hard-standmg 
because  we  felt  that  these  could  be  reducing  the  overall  correlation. 

Figure  9a  shows  pixel  map  of  NIR  for  the  part  of  the  image  called  aphillcut:  it  is  in  the  north 
western  part  of  the  original  image,  Figurel.  It  is  an  area  of  76  x  83  pixels  compared  with  the  151  x 
178  pixels  of  the  original  image:  it  is  half  the  area.  Figure  9b  shows  the  pixel  map  of  NDVI 
(Normalized  Vegetation  Index).  The  maps  show  similar  patterns  of  variation.  Variograms  were 
recomputed  and  modelled  for  NIR  and  NDVI,  Figure  10a  and  b,  respectively.  They  show  the 
experimental  semivariances  as  symbols  and  the  solid  lines  are  the  fitted  models.  Table  1  gives  the 
model  parameters.  They  were  both  fitted  best  by  nested  spherical  functions,  and  their  distance 
parameters  are  very  similar:  the  short-range  component  is  near  to  140  m  and  the  long-range  one  is 
about  230  m.  The  latter  is  shorter  than  that  for  the  larger  image  possibly  because  we  have  fitted  the 
model  to  a  shorter  overall  lag  distance  because  the  experimental  variograms  became  slightly 

irregular. 


Table  1.  Model  parameters  for  selected  properties 


Variable 

Model 

Model  parameters 

Co 

Cl 

Cl 

a\ 

#2 

a 

Sub- 

Stable 

0.00 

105.7 

5.653 

1.63 

sampled 

exponential 

3 

DEM 

NIR 

Double 

0.00 

115.4 

109.1 

6.577 

10.49 

spherical 

NDVI 

Double 

0.00 

6.940 

11.72 

spherical 

0.002707 

0.002187 

Note:  c0  is  the  nugget  variance,  c,  and  c2  the  sills  of  the  autocorrelated  variance,  ax  and  a2  the 
range  of  spatial  dependence  and  a  is  the  exponent  parameters  for  the  stable  exponential  model. 
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The  original  digital  elevation  model  for  this  area  was  at  a  5-m  resolution.  We  sub-sampled  this  to 
obtain  information  at  the  same  resolution  as  the  SPOT  image  (20-m).  At  the  time  that  these  data 
were  analysed  we  did  not  know  that  the  DEM  comprised  interpolated  data  from  30-m  resolution 
measurements  of  elevation.  Both  the  5-m  and  20-m  data  elevation  data  showed  strong  trend  as  for 
the  larger  area.  The  trend  was  modelled  by  linear,  quadratic  and  cubic  functions  fitted  to  the 
coordinates.  Table  2  shows  the  proportion  of  the  variance  accounted  for  by  the  three  functions  for 
both  data  sets.  Five  transects  along  rows  and  columns  of  the  data  were  selected  from  the  image  for 
a  wavelet  analysis  by  E.  Bosch.  These  were  also  examined  for  trend.  Table  3  gives  the  percentage 
variance  accounted  for  by  trend.  It  is  clear  that  the  trend  is  quite  variable  from  place  to  place. 


Table  2.  Trend  analysis  for  the  original  and  sub-sampled  DEM  data. 


%  variance  accounted  for 
Linear _ Quadratic  Cubic 


Full  DEM  data  15.7%  31.7%  41.7% 

Sub-sampled  DEM*  14.5%  30.2%  40.4% 


*  DEM  data  was  sub-sampled  to  match  the  co-ordinates  of  the  spectral  values  from  the  SPOT  data. 


Table  3.  Trend  analysis  for  transects  taken  from  the  full  DEM  data. 


Transect 

%  of  variance  accounted  for  by 
trend 

Transect  1 :  along  y  axis 

5.8 

Transect  2:  along  y  axis 

41.5 

Transect  3:  along  x  axis 

16.5 

Transect  4:  along  x  axis 

21.3 

Transect  5:  along  y  axis 

30.1 

The  variogram  was  computed  for  the  20-m  elevation  data  using  the  residuals  from  the  cubic  trend 
function.  Figure  11a  shows  the  experimental  variogram.  It  is  upwardly  concave  near  to  the  origin 
and  the  usual  models  that  we  fit  were  not  appropriate.  Since  we  consider  the  Gaussian  model  to  be 
unreliable  we  have  fitted  a  stable  exponential  function  whose  equation  is  given  by 


y(h)  =  cs  1  -  exp 


( 

\  ra) 
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Figure  1 1.  a)  Experimental  variogram  for  the  elevation  data  for  ‘aphillcut’,  b)  the  fitted  stable 
exponential  model  where  the  exponent  was  1.6. 


where  c  is  the  sill  variance,  h  is  the  lag  and  a  is  an  exponent  that  must  be  less  than  2.  Figure  1  lb 
shows  the  experimental  semivariances  and  the  fitted  stable  exponential  model  for  elevation.  Table  1 
gives  the  model  parameters.  The  approximate  working  range  for  the  exponential  function  is  340  m. 

To  use  this  variogram  the  kriging  program  had  to  be  amended,  as  we  have  not  used  it  before  for 
kriging.  Kriging  was  done  on  the  residuals  (for  the  20-m  data)  because  the  presence  of  trend 
violates  the  assumptions  of  geostatistics.  After  kriging  the  trend  was  added  back  to  the  estimates 
using  the  residuals.  Figure  12  shows  the  kriged  estimates  of  elevation  for  the  ‘aphillcut’  area.  A 
visual  inspection  of  the  maps  for  NIR  (Figure  9a)  and  Figure  12  shows  that  the  valley  systems  in 
the  DEM  correspond  with  the  large  NIR  values,  while  the  interfluves  have  intermediate  NIR 
values. 

For  the  correlation  analysis  the  raw  elevation  data  on  a  30  m  grid  were  used,  Figure  13.  They  were 
compared  with  coincident  pixel  information  for  NIR  and  NDVI,  Figure  14a  and  b.  Figure  15a  and  b 
shows  the  maps  of  the  differences  between  the  elevation  data  and  the  pixel  information  from  the 
SPOT  image.  Table  4  shows  that  these  correlations  between  the  raw  data  are  small.  Since  we  know 
from  experience  that  local  noise  in  data  can  sometimes  obscure  relations  between  variables  we 
computed  the  correlation  coefficients  between  the  punctually  and  block  kriged  estimates  and 
moving  averages  for  the  DEM,  NIR  and  NDVI.  Figures  16  and  17  show  the  moving  averages  for 
these  variables  for  the  10  by  10  window.  The  strongest  relation,  albeit  still  weak,  is  between  the 
DEM  and  NDVI  for  the  moving  average  computed  with  a  window  of  10  x  10  pixels.  It  is  evident 
that  the  visual  relation  between  the  observed  pattern  in  the  image  data  and  the  DEM  does  not 
appear  to  be  as  strong  statistically  as  visually. 
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Figure  12.  Map  of  punctually  kriged  estimates  of  elevation  using  the  stable  exponential  model  for 
‘aphillcut’. 
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Figure  13.  Pixel  map  of  original  (raw)  elevation  values  on  the  30  m  grid  for  ‘aphillcut’. 
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Figure  17.  Pixel  map  of  the  moving  average  for  a  block  of  10  x  10  pixels  for  a)  NIR  and  b)  NDVI, 
for  ‘aphillcut’. 
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Table  4.  Correlation  coefficients  for  DEM,  NIR  and  NDVI. 

Property  1 

Property  2 

Correlation 

coefficient 

DEM 

NIR 

-0.172 

DEM 

NDVI 

-0.175 

DEM  residuals 

NIR 

-0.034 

DEM  residuals 

NDVI 

-0.042 

Block  kriged  DEM 

Block  kriged  NIR 

0.235 

Block  kriged  DEM 

Block  kriged  NDVI 

0.219 

DEM  moving  average: 
block  side  of  3 

NIR  moving  average: 
block  side  of  3 

-0.197 

DEM  moving  average: 
block  side  of  3 

NDVI  moving  average: 
block  side  of  3 

-0.203 

DEM  moving  average: 
block  side  of  10 

NIR  moving  average: 
block  side  of  10 

-0.315 

DEM  moving  average: 
block  side  of  10 

NDVI  moving  average: 
block  side  of  10 

-0.330 

Summary 

The  purpose  of  this  detailed  investigation  between  the  DEM  information  and  the  image  data  was  to 
assess  the  strength  of  their  coregeonalization.  If  it  had  been  stronger  it  could  have  been  used  to 
restore  compressed  image  information  using  cokriging.  The  coregionalization  is  too  weak  to  exploit 
in  this  area,  but  it  might  be  worth  considering  elsewhere. 
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PART  II  HYPERSPECTRAL  IMAGERY 

The  hyperspectral  imagery  of  part  of  Fort  A.  P.  Hill,  Virginia,  was  sent  as  ‘hymap’  data 
in  May.  However,  some  errors  were  found  in  the  original  data  after  the  analysis  presented 
in  the  first  interim  report.  Therefore,  that  analysis  has  been  completely  redone  as  part  of 
the  new  analysis.  The  spectral  information  had  been  subdivided  into  126  bands,  and  the 
pixel  size  is  7  m  x  7m.  The  area  examined  in  the  report  is  a  subset  of  the  SPOT  image 
examined  previously.  It  corresponds  with  ‘aphillcuf ,  which  is  a  subset  of  the  SPOT 
image  used  for  making  comparisons  with  the  digital  elevation  model  (DEM)  in  more 
detail  (see  pages  15  to  22).  Our  aim  in  this  investigation  is  to  determine  whether  some  of 
the  bands  duplicate  information  using  principal  components  analysis,  a  non-hierarchical 
classification,  the  variogram,  pixel  maps  and  factorial  kriging. 

Correlation  matrix 

The  correlation  matrix  was  computed  as  part  of  the  principal  component  analysis  on  a 
reduced  data  set  of  one  pixel  in  a  block  of  9.  Appendix  1  gives  the  correlation  matrix. 
There  are  groups  of  bands  with  correlations  greater  that  0.8  showing  strong  relations 
among  the  bands.  The  most  obvious  groups  that  emerge  are:  bands  1  to  18,  20  to  61,  96 
to  125,  which  is  also  correlated  with  bands  1  to  18,  and  66  to  94.  It  is  difficult  to  identify 
more  subtle  groupings,  but  those  identified  emerge  in  other  analyses  described  later. 

Is  it  also  likely  that  these  will  contain  similar  spatial  information?. 


Principal  Components  Analysis 

Prinicpal  components  analysis  (PCA)  is  a  multivariate  analysis  that  aims  to  reduce  the 
dimensionality  of  the  data  to  explore  the  relations  among  variables.  For  this  reason  it  has 
been  widely  used  for  exploring  hyperspectral  imagery  as  a  means  of  limiting  the  number 
of  bands  used  for  further  analysis.  However,  this  approach  might  not  be  entirely 
satisfactory  because  some  potentially  valuable  information  might  be  excluded  from 
further  analysis  (Green  et  al.,  1988;  Chang  and  Du,  1999). 

The  PCA  was  done  on  the  correlation  matrix  between  all  of  the  bands  (see  Appendix  1). 
The  questions  posed  are: 

Can  the  number  of  bands  containing  relevant  information  be  reduced  by  examining  the 
correlation  matrix  or  by  a  PCA? 

Do  the  strongly  correlated  bands  imply  that  from  a  spatial  context  these  are  mainly 
redundant  and  that  we  can  look  at  one  to  obtain  the  information  of  value? 

We  shall  examine  some  of  the  strongly  correlated  bands  to  assess  whether  this  is  so. 

Tables  5  and  6  give  the  results  of  the  principal  component  analysis  (PCA).  Table  5  gives 
the  latent  roots  or  eigenvalues  of  the  first  ten  principal  axes.  The  first  two  account  for  just 
over  90%  of  the  variation,  Table  5.  This  is  an  expression  of  the  extent  of  correlation 
between  the  wavebands.  Table  6  gives  the  latent  vectors  for  the  first  five  principal  axes 
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only  since  those  for  roots  6  to  10  account  for  so  little  of  the  variation.  For  PCI  bands  18 
to  19,  61  to  62,  66  to  94  and  99  to  122  have  large  latent  vectors.  For  PC2  bands  20  to  59 
have  large  values,  and  for  PC3  it  is  bands  1  to  13.  For  PC4  bands  63  to  66,  and  band  95 
have  large  latent  vectors  and  for  PC5  it  is  band  126.  For  PCs  6  to  10  the  wavebands  that 
have  large  values  for  PCs  4  and  5  dominate;  little  new  information  appears  to  emerge. 
The  limiting  values  of  the  latent  vectors  have  been  chosen  by  eye,  based  on  the  level  at 
which  there  is  an  abrupt  change  to  a  smaller  value. 

Figure  17  shows  the  spectrum  of  a  range  of  ground  cover  types.  In  general  they  have  a 
similar  form  with  distinct  changes  occurring  at  certain  points  in  the  spectrum  for  most 
ground  cover  types.  It  is  interesting  to  observe  that  boundaries  or  changes  occur  at 
waveband  positions:  13,  18,  35  to  37,  47,  58,  60,  63,  70  and  94.  These  have  some 
association  with  the  groupings  of  the  wavebands  that  have  large  latent  vectors  for  PCs  1 
and  2,  and  is  what  we  might  expect. 


Table  5  Selected  latent  roots  or  eigenvalues  of  the  correlation  matrix  for  the 
hyperspectral  image  data 


Latent  Latent  roots  %  variance 


Roots 


1 

83.65 

66.39 

2 

33.40 

26.51 

3 

3.90 

3.09 

4 

1.93 

1.53 

5 

0.71 

0.56 

6 

0.41 

0.33 

7 

0.37 

0.30 

8 

0.29 

0.23 

9 

0.26 

0.20 

10 

0.20 

0.16 

Table  6.  Latent  vectors  or  eigenvectors  of  the  correlation  matrix  for  the 
hyperspectral  image  datafor  the  first  five  principal  components. 
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PCI 

PC2 

PC3 

PC4 

PC5 

1.0000 

2.0000 

3.0000 

4.0000 

5.0000 

1 

-0.0782 

-0.0654 

0.2224 

0.0912 

-0.0746 

2 

-0.0764 

-0.0900 

0.2281 

0.0312 

-0.0266 

3 

-0.0773 

-0.0928 

0.2191 

0.0249 

-0.0165 

4 

-0.0779 

-0.0945 

0.2117 

0.0196 

-0.0115 

5 

-0.0791 

-0.0944 

0.2071 

0.0142 

-0.0184 

6 

-0.0830 

-0.0857 

0.2101 

0.0111 

-0.0358 

7 

-0.0894 

-0.0636 

0.2157 

0.0071 

-0.0694 

8 

-0.0924 

-0.0507 

0.2129 

0.0044 

-0.0818 

9 

-0.0928 

-0.0526 

0.2039 

0.0057 

-0.0744 

10 

-0.0905 

-0.0682 

0.1914 

-0.0003 

-0.0731 

11 

-0.0889 

-0.0780 

0.1774 

-0.0053 

-0.0646 

12 

-0.0888 

-0.0823 

0.1626 

-0.0054 

-0.0525 

13 

-0.0883 

-0.0869 

0.1474 

-0.0040 

-0.0395 

14 

-0.0873 

-0.0910 

0.1360 

-0.0159 

-0.0577 

15 

-0.0863 

-0.0959 

0.1224 

-0.0122 

-0.0309 

16 

-0.0855 

-0.0993 

0.1101 

-0.0124 

-0.0156 

17 

-0.0875 

-0.0956 

0.1022 

-0.0115 

-0.0139 

18 

-0.1008 

-0.0423 

0.1097 

-0.0188 

-0.0791 

19 

-0.0994 

0.0472 

0.1084 

-0.0032 

-0.0731 

20 

-0.0802 

0.1091 

0.1058 

0.0129 

-0.0250 

21 

-0.0681 

0.1298 

0.1019 

0.0254 

0.0113 

22 

-0.0641 

0.1352 

0.0955 

0.0309 

0.0278 

23 

-0.0628 

0.1369 

0.0949 

0.0265 

0.0308 

24 

-0.0630 

0.1373 

0.0898 

0.0281 

0.0330 

25 

-0.0633 

0.1374 

0.0852 

0.0273 

0.0337 

26 

-0.0637 

0.1373 

0.0806 

0.0250 

0.0345 

27 

-0.0636 

0.1377 

0.0766 

0.0252 

0.0363 

28 

-0.0634 

0.1382 

0.0737 

0.0254 

0.0385 

29 

-0.0635 

0.1382 

0.0708 

0.0233 

0.0376 

30 

-0.0641 

0.1379 

0.0668 

0.0242 

0.0386 

31 

-0.0641 

0.1378 

0.0666 

0.0243 

0.0388 

32 

-0.0650 

0.1368 

0.0589 

0.0274 

0.0430 

33 

-0.0661 

0.1358 

0.0591 

0.0140 

0.0332 

34 

-0.0674 

0.1344 

0.0482 

0.0224 

0.0413 

35 

-0.0700 

0.1317 

0.0441 

0.0123 

0.0299 

36 

-0.0709 

0.1303 

0.0406 

0.0053 

0.0242 

37 

-0.0712 

0.1302 

0.0373 

0.0036 

0.0250 

38 

-0.0710 

0.1309 

0.0340 

0.0079 

0.0316 

39 

-0.0703 

0.1317 

0.0322 

0.0112 

0.0366 

40 

-0.0692 

0.1331 

0.0343 

0.0097 

0.0378 

41 

-0.0686 

0.1339 

0.0348 

0.0077 

0.0362 

42 

-0.0682 

0.1344 

0.0333 

0.0081 

0.0389 

43 

-0.0681 

0.1345 

0.0327 

0.0060 

0.0382 

44 

-0.0686 

0.1338 

0.0286 

0.0077 

0.0399 

45 

-0.0699 

0.1322 

0.0235 

0.0090 

0.0413 

46 

-0.0718 

0.1294 

0.0180 

0.0112 

0.0406 

47 

-0.0762 

0.1237 

0.0085 

-0.0039 

0.0181 

48 

-0.0810 

0.1155 

-0.0057 

-0.0128 

0.0058 

49 

-0.0837 

0.1102 

-0.0139 

-0.0172 

-0.0052 

50 

-0.0849 

0.1082 

-0.0189 

-0.0154 

-0.0041 

24 


51 

-0.0861 

0.1060 

-0.0256 

-0.0111 

-0.0004 

52 

-0.0863 

0.1054 

-0.0262 

-0.0135 

-0.0037 

53 

-0.0857 

0.1067 

-0.0266 

-0.0149 

-0.0008 

54 

-0.0848 

0.1084 

-0.0275 

-0.0124 

0.0044 

55 

-0.0844 

0.1091 

-0.0298 

-0.0093 

0.0095 

56 

-0.0842 

0.1093 

-0.0306 

-0.0116 

0.0093 

57 

-0.0846 

0.1084 

-0.0320 

-0.0147 

0.0053 

58 

-0.0856 

0.1062 

-0.0371 

-0.0142 

0.0073 

59 

-0.0873 

0.1022 

-0.0433 

-0.0156 

0.0069 

60 

-0.0898 

0.0961 

-0.0516 

-0.0176 

0.0056 

61 

-0.0923 

0.0883 

-0.0606 

-0.0131 

0.0126 

62 

-0.0920 

0.0845 

-0.0660 

0.0005 

0.0333 

63 

-0.0222 

-0.0092 

-0.0652 

0.5980 

-0.1605 

64 

-0.0386 

-0.0201 

-0.0758 

0.5664 

-0.1191 

65 

-0.0735 

-0.0417 

-0.0876 

0.3890 

-0.0471 

66 

-0.0947 

-0.0524 

-0.0924 

0.1789 

0.0144 

67 

-0.1007 

-0.0548 

-0.0816 

0.0605 

0.0059 

68 

-0.1020 

-0.0508 

-0.0810 

0.0524 

0.0018 

69 

-0.1035 

-0.0446 

-0.0844 

0.0201 

0.0216 

70 

-0.1046 

-0.0385 

-0.0856 

-0.0049 

0.0086 

71 

-0.1056 

-0.0313 

-0.0868 

-0.0192 

-0.0124 

72 

-0.1063 

-0.0242 

-0.0887 

-0.0234 

-0.0246 

73 

-0.1067 

-0.0176 

-0.0902 

-0.0234 

-0.0337 

74 

-0.1070 

-0.0118 

-0.0904 

-0.0260 

-0.0426 

75 

-0.1072 

-0.0068 

-0.0899 

-0.0299 

-0.0534 

76 

-0.1072 

-0.0025 

-0.0912 

-0.0266 

-0.0524 

77 

-0.1072 

0.0010 

-0.0914 

-0.0273 

-0.0569 

78 

-0.1070 

0.0038 

-0.0937 

-0.0240 

-0.0548 

79 

-0.1069 

0.0063 

-0.0936 

-0.0274 

-0.0619 

80 

-0.1068 

0.0082 

-0.0937 

-0.0288 

-0.0651 

81 

-0.1068 

0.0091 

-0.0933 

-0.0277 

-0.0679 

82 

-0.1068 

0.0091 

-0.0916 

-0.0286 

-0.0718 

83 

-0.1068 

0.0087 
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Band 


- Profile  1:  Urban 

- Profile  2:  Urban/bare  soil 

- Profile  3:  Sparse  vegetation 

— —  Profile  4:  Vegetation  trees 

- Profile  5:  Vegetation  trees 

- Profile  6:  Vegetation 

- Profile  7:  Vegetation-  less  dense 

- Profile  8:  Bare  soil/regrowth 

Figure  1 8  Spectral  information  from  the  hymap  image  of  part  of  A.  P.  Hill  for  six  ground 
cover  types. 


Non-hierarchical  classification 

For  this  multivariate  classification  (see  Webster  and  Oliver,  1990)  there  is  no  assumption 
that  there  is  any  hierarchy  in  the  multivariate  structure  of  the  data.  The  majority  of 
multivariate  methods  of  classification  are  based  on  a  hierarchy  of  classes,  such  as  nearest 
neighbour,  median  clustering  etc.  Since  there  is  no  natural  hierarchical  structure  in  the 
spectral  data  the  non-hierarchical  method  was  used.  The  subdivision  occurs  at  one  level 
only,  and  the  method  is  often  known  as  dynamic  clustering.  Essentially  the  aim  is  to 
minimize  the  variation  within  groups  and  maximize  that  between  them  based  on  a 
selected  mathematical  criterion.  These  include  the  sum  of  squares  (SSP)  within  classes, 
Wilks’  criterion  or  the  trace  W  *B,  where  W  is  the  SSP  matrix  within  groups  and  B  is  the 
between-groups  SSP  matrix.  The  chosen  test  criterion  is  calculated  after  an  arbitrary 
partition  of  the  data  (or  a  pre-existing  classification  can  be  used  if  available),  and 
individuals  are  moved  from  group  to  group  and  the  criterion  calculated  afresh  each  time. 
The  procedure  continues  iteratively  until  no  improvement  seems  possible.  In  Genstat 
(Genstat  5  Committee,  1993)  local  optimal  solutions  in  the  classification  are  avoided  by 
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changing  to  moving  pairs  of  individuals  and  then  returning  to  single  ones  again  (see 
Webster  and  Oliver,  1990  for  more  detail). 

For  this  analysis  I  wanted  to  classify  the  wavebands  and  not  the  pixels,  therefore  I 
selected  the  bands  for  just  52  pixels  scattered  over  the  image.  These  data  were  then 
reorganised  so  that  the  bands  appeared  as  the  individuals  and  the  pixels  as  the  variables.  I 
used  the  three  criteria  given  above,  but  the  results  for  the  sum  of  squares  within  criterion 
was  the  only  one  to  give  a  satisfactory  result.  This  was  because  of  the  degree  of 
correlation  in  the  data.  To  avoid  possible  problems  from  local  optima  I  started  with  15 
classes  and  went  in  steps  of  one  to  six  classes.  The  criterion  value  was  plotted  against  the 
number  of  classes,  Figurel9.  The  class  at  which  the  greatest  change  in  the  criterion  value 
occurred  was  selected.  This  was  for  8  classes  in  this  analysis.  The  wavebands  belonging 
to  each  class  are  given  in  Table  9.  They  show  some  relation  with  the  bands  associated 
with  the  first  5  PCs,  and  also  with  the  major  changes  in  the  spectrum,  Figurel8.  It  is 
evident  at  this  early  stage  of  the  analysis  that  some  patterns  are  emerging. 


0 


Figure  19.  Sum  of  squares  within  criterion  plotted  against  group  size,  g. 
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Variography 

Experimental  variograms  were  computed  from  the  full  pixel  information  for  all 
wavebands.  Figures  20  to  30.  Several  bands  have  variograms  with  similar  forms.  Overall 
there  appear  to  be  six  main  shapes  of  variogram  evident,  disregarding  the  size  of  the 
nugget  and  sill  variances.  The  variograms  within  the  following  groups  of  bands  appear  to 
be  similar:  bands  (a)  1  to  17;  (b)  20  to  62;  (c)  63  to  64;  (d)  19,  65,  71  to  94, 126;  (e)  66  to 
70,  95;  (f)  96  to  125,  18  (see  also  Table  9).  These  show  some  relation  with  the  groups 
identified  by  the  non-hierarchical  classification  and  the  PCA.  Bands  63  and  64  have  pure 
nugget  variograms,  which  are  different  from  all  of  the  others,  and  bands  95  and  126  also 
have  variograms  that  are  different  which  also  emerges  in  the  results  of  the  PCA. 

The  experimental  variograms  of  bands  1  to  18  and  65  to  126  have  a  wavelike  form  which 
suggests  that  there  is  some  repetition  in  the  variation  at  a  distance  of  about  1000  m.  We 
can  be  fairly  certain  that  this  is  not  a  sampling  effect  because  of  the  large  size  of  the  data 
set.  This  repetition  is  likely  to  relate  to  the  nature  of  the  physiography  in  the  region,  for 
example  the  two  large  valley  systems  (see  Figure  12). 

Experimental  variograms  were  also  computed  from  the  scores  of  the  first  five  principal 
components  (Figure  31).  The  variograms  for  PCI  to  PC3  resemble  some  of  the  main 
structural  features  seen  in  the  individual  variograms,  but  not  precisely  so.  The 
experimental  variograms  for  PC4  is  pure  nugget,  which  resembles  the  variograms  for 
bands  63  and  64,  which  also  load  heavily  on  this  component.  Therefore,  we  can  interpret 
this  component  as  containing  no  spatial  information.  The  variogram  for  PC5  shows  some 
structure,  albeit  little.  It  has  a  form  similar  to  the  variogram  for  band  126,  which  also 
loads  heavily  on  PC5. 


Overall  the  variograms  suggest  that  there  is  little  noise  in  the  wavebands,  apart  from 
bands  63,  64  and  126.  This  is  of  interest  since  most  variograms  show  strong  spatial 
structure.  The  PCA  also  identified  these  wavebands  that  exhibit  little  structure. 
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Figure  23.  Experimental  variograms  for  bands  37  to  48. 
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Figure  24.  Experimental  variograms  for  bands  49  to  60. 
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Figure  27.  Experimental  variograms  for  bands  85  to  96 
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Figure  30.  Experimental  variograms  for  bands  121  to  126 


Figure  31.  Experimental  variograms  for  principal  components  1  to  5 
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Variogram  modelling 


Mathematical  models  were  fitted  to  the  experimental  variograms  of  each  band  and  the 
principal  components.  Nested  models,  either  spherical  or  exponential  provided  the  best  fit 
to  most  of  the  variograms  in  the  least  squares  sense.  No  attempt  was  made  to  model  the 
periodicity  as  the  wavelength  was  fairly  constant  for  those  variograms  showing  signs  of 
repetition.  The  variograms  of  the  bands  and  the  first  five  principal  components  were 
modelled  to  a  lag  distance  that  enabled  a  stable  fit,  lags  of  60,  90,  95  and  100  were  used. 
Figures  32  to  43  show  the  experimental  variograms  as  symbols  and  the  fitted  models  as 
lines.  It  is  difficult  to  distinguish  between  the  experimental  values  and  the  models  in 
many  instances  because  of  the  density  of  the  experimental  values  and  the  good  fit  of  the 
models. 

Table  7  gives  the  parameters  of  the  fitted  models,  and  Table  8  the  summary  statistics  of 
the  models  fitted  to  the  wavebands  (not  for  the  models  fitted  to  the  PCs).  Apart  from 
bands  63  and  64  which  are  pure  nugget  the  models  are  nested  exponential  or  spherical 
functions.  Many  models  have  no  nugget  variance,  but  for  others  it  is  large.  To  summarise 
the  features  of  the  models  the  pure  nugget  variograms  have  been  removed,  but  they  were 
included  for  later  analyses.  There  are  also  considerable  differences  in  the  sill  variances  of 
the  models  even  within  groups  of  variograms  that  have  similar  structures.  For  the 
exponential  models  the  distance  parameter  has  been  multiplied  by  3  in  Table  7  to  give  the 
approximate  range  of  spatial  dependence.  The  first  distance  parameter  of  the  nested 
functions  is  remarkably  similar  throughout.  It  has  a  small  range  of  between  25.2  m  (7  x 
3.6)  for  band  20  and  43  m  (7  x  6.14  for  band  95).  The  average  is  34.3  m  for  the 
wavebands  and  for  PCI  it  is  35.5  m  (1.69  x  21).  This  corresponds  closely  with  the  short- 
range  structure  identified  from  the  1  m  imagery  analysed  and  described  in  project  report 
N68 171 -97-C-9092.  There  is  much  more  variation  in  the  size  of  the  long-range  structure. 
The  smallest  value  for  the  long-range  component  is  185.5  m  (7  x  26.5)  and  this  extends 
to  a  maximum  of 994  m  (7  x  142),  which  corresponds  to  the  size  of  one  of  the  structures 
identified  in  the  SPOT  image  for  this  area.  The  average  long-range  component  is  422.6  m 
(7  x  60.37). 

The  average  nugget  variance  (excluding  the  values  for  bands  63  and  64)  is  91 1,  with  a 
minimum  of  zero  and  a  maximum  of  33249.  The  average  sill  variance  of  the  first 
structure  is  128862.5,  with  a  range  of  682  to  589718,  which  is  considerable.  For  the 
second  structure  the  average  sill  variance  is  1 13863.8,  with  a  minimum  of  91 1  and  a 
maximum  of  464529,  which  is  slightly  less  than  for  the  first  structure.  It  is  evident  from 
Figures  32  to  43  that  the  overall  sill  variance  varies  considerably,  whereas  the  long-range 
component  is  more  consistent  within  the  groups  that  can  be  identified  visually  from  the 
experimental  variograms,  Figures  20  to  30. 


Bands  1  to  17  and  70  to  73  have  a  consistent  long-range  component  of  about  475  m, 
which  is  close  to  that  for  PCI,  435.5  m.  For  bands  19,  74  to  94  the  long-range 
component  is  about  390  m,  for  bands  20  to  62  it  is  about  225  m,  and  for  bands  65  to  69  it 
is  just  over  500  m.  For  bands  96  to  101  and  1 15  to  125,  and  PC5  it  is  about  700  m.  For 
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bands  102  to  1 1 5  it  is  about  600  m,  and  for  bands  95  and  1 26,  and  PCc  2  and  3  it  is  about 
900  m.  It  is  evident  that  from  this  crude  grouping  based  on  the  long-range  component  that 
eight  groups  emerge,  if  the  pure  nugget  variograms  are  considered  as  a  group  on  their 
own.  Table  9.  This  is  the  same  number  of  groups  as  for  the  classification  of  the  raw  data, 
Table  9.  It  is  interesting  to  note  that  there  is  an  order  in  the  values  of  the  long-range 
component  with  the  order  of  the  wavebands  —  sometimes  decreasing  and  at  others 
increasing. 

The  forms  of  the  variograms  of  the  PCs  appear  to  be  similar  to  groups  of  variograms. 
Figure  43.  This  relates  more  to  the  long-range  component  than  to  the  short-range 
component, ,  except  for  PCI.  The  relations  between  the  PC  models  and  groups  of 
wavebands  is  described  above.  Since  the  PCA  was  done  a  sample  of  the  data  (1  in  9 
pixels)  the  values  for  the  range  have  been  multiplied  by  2 1  to  obtain  the  values  in  metres. 

To  classify  the  model  parameters  in  the  same  way  as  that  for  the  raw  digital  data  the 
parameters  were  standardized  to  zero  mean  and  unit  variance.  This  is  because  the 
different  scales  on  which  the  parameters  have  been  measured  would  give  too  much 
weight  to  the  sill  and  nugget  variances.  The  standardized  model  parameters  were 
classified  in  the  way  described  above  and  the  number  of  classes  selected  was  based  on 
the  group  with  largest  change  in  the  criterion.  As  far  as  I  know  the  classification  of  the 
model  parameters  of  the  wavebands  is  novel.  Table  9  gives  the  results  of  the 
classification  for  an  optimal  classification  with  eight  classes. 
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Figure  32:  Experimental;  variograms  (symbols)  and  fitted  models  (lines)  for  bands  1  to  12. 
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Figure  33:  Experimental;  variograms  (symbols)  and  fitted  models  (lines)  for  bands  13  to  24 
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Figure  34:  Experimental;  variograms  (symbols)  and  fitted  models  (lines)  for  bands  25  to  36 
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Figure  35:  Experimental;  variograms  (symbols)  and  fitted  models  (lines)  for  bands  37  to  48 
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Figure  36:  Experimental;  variograms  (symbols)  and  fitted  models  (lines)  for  bands  49  to  60- 
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Figure  37:  Experimental;  variograms  (symbols)  and  fitted  models  (lines)  for  bands  61  to  72 
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Figure  38:  Experimental;  variograms  (symbols)  and  fitted  models  (lines)  for  bands  73  to  84 
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Figure  39:  Experimental;  variograms  (symbols)  and  fitted  models  (lines)  for  bands  85  to  96 
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Figure  40:  Experimental;  variograms  (symbols)  and  fitted  models  (lines)  for  bands  97  to  108 
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Figure  41:  Experimental;  variograms  (symbols)  and  fitted  models  (lines)  for  bands  109  to  120 
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Figure  42:  Experimental  variograms  (symbols)  and  fitted  models  (lines)  for  bands  109  to  126 
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Figure  43:  Experimental  variograms  (symbols)  and  fitted  models  (line)  for  PCs  1,  2,  3  and  5 
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Table  7.  Model  parameters  fitted  to  the  individual  wavebands 


Band!  Lag 

Model 

Nugget 

Sill  1 

Sill  2  | 

Range  1  j 

Range  2j 

i  Distance  i 

type 

variance 

variance 

variance  j 

/7m 

/7m 

1 

630  m| 

Double  (D)  S 

2,194 

2,454 

■sm 

4.541  70.9 

« 

D  Spherical(S) 

763’ 

2,168’ 

6,726 ! 

4.51 1  69.8 

3 

ii 

DS 

931! 

2,829 

8,707 

4.581 

69.4 

4 

ii 

DS 

1 ,235  • 

3,756 

11,422 

4.64 j 

69.3 

5 

ii 

DS 

1,526 

4,688 

14,183 

4.63 

69.4 

6  ; 

ii 

DS 

1,672 

5,109 

15,407 

4.43 

69.7 

7 

'» 

DS 

1,776 

5,513 

15,023: 

3.91 

69.0 

8 

ii 

DS 

2,059’ 

5,973' 

14,718 

3.81 

68.1 

9 

" 

•  DS 

2,133 

6,197 

14,881 

3.83 

67.7 

10 

n 

DS 

1,943’ 

6,123’ 

15,361 

4.03 

67.9 

ii 

ti 

DS 

2,218 

6,325 

16,241 

4.54 

68.2 

12 

•i 

DS 

2,369 

7,058 

17,716 

4.76 

68.0 

13 

630  m 

DS 

2,530 

7,910 

19,589 

4.94 

67.8 

14 

ii 

DS 

2,837’ 

8,751 

20,991 

5.31 

68.0 

15 

ti 

DS 

3,274 

1 0,420 : 

24,326 

5.48 

68.1 

16 

•i 

D  Exponential 

3,968 

12,853 

28,978 

5.59 

68.2 

17 

n 

DE 

4,617 

14,654 

31,740 

5.69 

68.0 

18 

ii 

DE 

6,55b1 

17,357 

25,291 

5.47 

62.8 

19 

420  m 

DE 

1,877 

46,309 

30,147 

4.05 

58.8 

20 

420  m 

DE 

0 

165,304’ 

64,437 

3.60 

37.5 

. 21 

DE 

0 

163,557 

117,817 

3.69 

37.1 

22 

DE 

0 

196,399 

153,169 

3.80 

37.4 

. 23 . 

420  m 

DE 

0 

207,745 

165,444 

3.82 

37.2 

24 

DE 

0 

219,071! 

175,665 

3.85 

36.9 

25 

DE 

0 

228,765 

183,876 

3.87 

36.5 

26 

DE 

0 

237,056 

192,098 

3.92 

36.2 

27 

DE 

0 

252,297 

206,424 

3.92 

35.6 

28 

DE 

0 

256,250 

211,930 

4.01 

35.5 

29 

DE 

. 0 

264,459 

218,230 

3.97 

35.0 

30 

DE 

0 

260,462 

215,512 

4.00| 

34.7 

31 

DE 

0 

. '259“435 

214,695 

4.01 ! 

34.7 

32 

420  m 

DE 

o 

274,028 

. 228,245 

4.06 

34.4 

33 

DE 

0 

259,864 

213,750 

3.96 

33.4 

. 34 . 

DE 

i  o 

283,580 

234,582 

4.07 

33.5 

35 

DE 

0 

258,307 

211,490 

4.04 

32.7 

36 

DE 

0 

. 255,783 

208,660 

4.04 

32.3 

37 

DE 

0 

267,707 

218,144 

4.04 

32.6 

38 

DE 

. 6 

283,374 

231 ,956 

4.10 

31.9 . 

39 

DE 

0 

299,100 

246,504 

4.14 

31.8: 

40 

DE 

0 

301,733 

248,919 

4.21 

31.7 

41 

DE 

7 . 6 

316,027 

259,568 

4.08 

31.4 

42 

420  m 

DE 

6 

325,911 

268,242 

4.12 

31.3 

43 

DE 

6 

328,631 

270,379 

4.14 

31.1 

44 

DE 

0 

338,345 

278,064 

4.13 

30.8 

""“45 . 

DE 

6 

341,894 

280,043 

. 4.13 

30.4 

46 

DE 

0 

334,013 

273,185 

4.21 

30.1 

47 

420m 

DE 

6 

286,330 

230,068 

4.11 

29.1 
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Band 

Lag 

Model 

Nugget 

Sill  1 

Sill  2 

Range  1 

Range  2 

Distance ' 

Type 

variance 

variance 

variance 

/7  m 

/7m 

98 

665  m 

DE 

6 

44,366 

69,023 

5.60 

96.6 

99 

DE 

1,058 

48,591 

71,392 

5.83 

93.0 

100 

DE 

6 

50,496 

74,630 

5.84 

91.5 

101 

DE 

0 

49,527 

76,319 

5.81 

90.6 

102 

DE 

0 

48,798 

77,307 

5.67 

89.2 

103 

DE 

6 

48,948 

79,529 

5.69 

88.6 

104 

DE 

6 

49,921 

82,203 

5.66 

88.1: 

105 

DE 

6 

50,054 

83,666 

5.72 

87.6 

106 

DE 

0 

51,445 

85,290 

5.72 

86.8 

107 

DE 

6 

53,025 

85,464 

5.79 

85.9 

108 

DE 

0 

55,139 

84,812 

5.81 

84.7 

109 

DE 

6 

55,707 

81,411 

5.80 

83.3 

110 

DE 

6 

55,545 

79,107 

5.78 

82.2 

111 

DE 

0 

51,118 

72,298 

5.77 

82.7 

112 

DE 

0 

45,907 

66,565 

5.63 

84.0 

113 

DE 

0 

39,890 

61,411 

5.58 

85.9! 

114 

DE 

6 

35,916 

57,262 

5.54 

87.2 

115 

665  m 

DE 

0 

33,002 

. 52,092 

5.50 

89.0 

116 

DE 

0 

30,091 

46,979 

5.37 

90.8 

117 

DE 

0| 

27,672 

42,148 

5.31 

91.5 

118 

DE 

0 

24,094 

38,685 

5.11 

93.1 

119 

DE 

0 

21,269 

36,464 

4.91 

94.9 

120 

. DE . 

0 

18,749 

33,032 

4.82 

95.5 

121 

665  m 

DE 

0 

15,730 

28,708 

4.59 

96  8 

122 

DE 

. . 6 

13,563 

23,278 

4.68 

96.3 

123 

DE 

6 

10,568 

18,439 

4.55 

98.9 

124 

665  m 

DE 

429 

6,564 

12,208 

4.23 

101.4 

125 

DE 

486 

3,590 

6,027 

4.06 

104.6 

126 

700  m 

DE 

1,507 

682 

911 

5.62 

142.0 

PCI 

245  m 

DS  29.69 

22.40 

31.67 

3.617 

19.29 

PC2 

210  m 

DE  I  9.670 

11.44 

17.42 

6.729 

44.79 

PC3 

245  m 

DS  1  1 .694 

1.085 

1.521 

4.739 

41.67 

PC5 

245  m 

DS  j  0.559 

0.086 

0.099 

2.543 

33.92 

Table  8.  Summary  statistics  of  model  parameters  of  hymap  wavebands. 


Parameter 

Minimum 

Maximum 

Nugget  variance 

911 

33249 

Sill  of  first  structure 

682 

589718 

Sill  of  second  structure 

113864 

464529 

Range  of  first  structure 

wmBEm 

Range  of  second  structure 

60.4 

(422.8  m) 

26.5 

(185.5  m) 

nm 
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Mapping  of  hyperspectral  wavebands 

Pixel  maps  of  the  digital  information  for  each  band  have  been  plotted  in  Figures  44-64. 
The  choice  of  map  colour  scale  was  based  on  the  minimum,  maximum  and  mean  values 
for  each  band.  To  enable  as  much  comparison  as  possible  the  same  scale  was  used  for 
groups  of  bands.  The  maps  show  clearly  the  range  of  spatial  textures  that  become  evident 
for  different  bands,  however  without  some  additional  expertise  from  TEC  it  is  not  clear 
how  to  interpret  the  variety  present.  It  is  evident  that  some  spectra  pick  out  the  detail  in 
the  built  up  area  in  the  north  central  part  of  the  map  whereas  others,  such  as  band  57 
seem  to  pick  out  the  physiography  more.  Many  maps  have  similar  features,  but  the 
texture  varies  even  for  those  with  similar  variograms.  The  maps  of  bands  63  to  65  and 
band  126  are  mainly  noise.  For  bands  63  and  64  this  was  expected  because  of  the  pure 
nugget  variograms.  Bands  65  and  126  have  large  nugget  to  sill  variances  of  42%  and 
49%,  respectively  which  is  expressed  by  the  speckled  appearance  of  their  maps. 


Factorial  kriging  of  selected  wavebands 

Wavebands  were  selected  on  the  basis  of  the  long-range  parameters  of  their  variogram 
models.  Figures  65  to  73  show  the  maps  of  the  long-  and  short-range  components  of  the 
variation.  The  short-range  component  was  fairly  consistent  for  the  different  wavebands 
and  this  is  evident  in  the  maps  based  on  this  scale  of  variation.  They  resemble  the  texture 
that  was  evident  in  the  1-m  imagery  for  the  site.  Roads,  tracks,  buildings  and  the 
woodland  texture  are  evident.  The  maps  of  the  long-range  component  differ  because  they 
are  based  on  different  spatial  scales.  Bands  95  and  126  (Figures  72  and  73)  have  the 
largest  long-range  component.  However,  the  variation  of  these  components  appears  to  be 
more  ‘noisy’  than  for  the  others  and  this  was  evident  in  the  raw  pixel  maps.  Bands  22, 47 
and  58  (Figures  66,  67  and  68)  have  the  shortest  long-range  components.  The  texture  in 
the  valley  (red)  and  spurs  (green  or  yellow)  can  be  seen.  This  probably  reflects  the 
intricate  physiography  of  the  area.  Bands  1 0, 66  and  83  (Figures  65,  69  and  70)  have 
intermediate  long-range  components.  The  texture  associated  with  the  small  tributary 
valleys  is  most  clear  in  these  maps. 

Further  discussion  with  personnel  at  TEC  is  needed  to  improve  the  interpretation  of  this 
analysis  as  with  the  raw  pixel  maps. 


Figure  45.  Pixel  maps  of  wavebands  7  to  12  for  the  hymap  image  data 
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;  46.  Pixel  maps  of  wavebands  13  to  18  for  the  hymap  image  data 
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Figure  47.  Pixel  maps  of  wavebands  19  to  24  for  the  hymap  image  data 
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Figure  48.  Pixel  maps  of  wavebands  25  to  30  for  the  hymap  image  data 


Figure  55.  Pixel  maps  of  wavebands  67  to  72  for  the  hymap  image  data 


Figure  58.  Pixel  maps  of  wavebands  86  to  90  for  the  hymap  image  data 
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Figure  64.  Pixel  maps  of  wavebands  1 15  to  120  for  the  hymap  image  data 
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Figure  64.  Pixel  maps  of  wavebands  121  to  126  for  the  hymap  image  data 
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Summary  of  classification  results 

Table  9  shows  the  groupings  of  the  bands  that  emerge  from  both  visual  (V)  and  statistical 
(S)  appraisals.  For  the  spectra  shown  in  Figure  18  it  is  the  major  changes  that  are  noted  in 
the  classification  in  Table  9.  For  the  maps  a  grouping  has  not  been  attempted  at  present  - 
it  is  merely  the  changes  in  the  texture  that  are  noted  in  Table  9.  The  groupings  identified 
from  the  correlation  matrix  are  also  reflected  in  those  for  the  first  5  Pcs  and  in  the 
experimental  variograms. 


Table  9.  Suggested  groupings  of  bands  based  on  visual  and  analytical  results. 


Method 

Major  changes  in  the  spectra 

Spectra  (V) 

13 

18 

35-37 

50 

60 

63 

70 

93 

PCI 

PC  2 

PC  3 

PC  4 

PC  5 

PCA 

18-19 

20-59 

1-13 

63-66 

126 

eigenvectors  (S) 

61-62 

95 

66-94 

99  -122 

Class  1 

Class  2 

Class  3 

Class  4 

Class  5 

Class  6 

Class  7 

Class  8 

Classification  of 

1-18 

70-93 

20-21 

27-39 

40-46 

22-26 

61  -62 

63 

raw  data  (S) 

64-69 

19 

57-58 

59-60 

49-56 

94-126 

47  -48 

Experimental 

1  -  17 

20-62 

63-64 

19,  65 

66-70 

96-125 

variogram  form 

71-94 

95 

18 

(V) 

126 

Variogramrlong- 

1  -  17 

19 

20-62 

63-64 

65-69 

96-101 

102-115 

95 

range  (S/V) 

74-94 

126 

Classification  of 

1-13 

14-18 

63-64 

27-28 

61-62 

20-27 

65-68 

model  parameters 

19 

69-94 

58-60 

49-57 

95-119 

(S) 

120-125 

126 

Sequence  of  changes  rioted  in  maps 


1-17 

18,  19, 

20 

21  -38 

39-46 

47-57 

58-62 

63-65 

66-69 

70-74 

126 

75-89 

90-92 

93-94 

95 

96-103 

104  — 

112 

113  — 
125 

Maps  (V) 
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Figure  66.  Factorially  kriged  estimates  for  waveband  22: 

a)  short-range  component  and  b)  long-range  component. 


I 

I 


□ 

cm 

m 


3000 

2500 

2000 

1500 

1000 

Below 


I 


3500 

3000 

2500 

2000 

1500 

1000 


Figure  68.  Factorially  kriged  estimates  for  waveband  58: 

a)  short-range  component  and  b)  long-range  component. 


Figure  70.  Factorially  kriged  estimates  for  waveband  83: 

a)  short-range  component  and  b)  long-range  component. 
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Introduction 


It  was  decided  while  on  a  visit  to  TEC  by  Mr  W.  Clark  that  Mr  E.  Bosch  and  Dr  M.  A. 
Oliver  should  extend  the  comparison  of  the  geostatistical  and  wavelet  analysis  that  they 
had  done  on  the  Fort  A.  P.  Hill  SPOT  image  to  a  large  data  set  of  soil  information.  The 
reason  for  this  was  to  see  how  the  techniques  performed  when  the  initial  data  are  a 
sample  rather  than  complete  cover  as  in  the  image.  For  the  SPOT  image  we  had  full 
cover  of  pixel  information  which  we  then  sampled.  Kriging  and  the  low  frequency 
wavelet  coefficients  were  used  to  restore  the  data  that  had  been  removed  (see  report  ?? 
and  Oliver  et  al.,  2000).  For  the  soil  data  we  started  with  sample  information  and 
resampled  this  for  the  analyses.  The  wavelet  analysis  of  the  soil  data  was  done  by  Mr  E. 
Bosch  during  Dr  Oliver’s  visit  to  TEC  in  June  2000. 


The  National  Soil  Inventory  of  England  and  Wales 

The  soil  data  that  we  have  analysed  are  part  of  the  National  Soil  Inventory  (NSI)  of 
England  Wales,  which  was  carried  out  by  the  Soil  Survey  of  England  and  Wales  between 
1978  and  1983  (McGrath  &  Loveland,  1992).  The  aim  of  the  survey  was  to  provide  a 
record  of  the  soil  information  in  these  countries  and  both  toxicity  and  deficiency  of  some 
elements  of  the  soil  that  affect  both  grazing  animals  and  arable  crops  at  the  national  level. 
For  the  NSI  to  be  an  unbiased  estimate  of  the  distribution  of  types  of  land  and  their 
properties,  strict  protocols  were  applied  to  site  location  and  description,  soil  sampling 
strategy,  and  soil  profile  description.  This  was  very  unlike  the  practice  of  free  soil 
survey  which  is  commonly  used  to  produce  conventional  soil  maps  (Avery,  1987). 
Considerable  effort  also  went  into  quality  control  of  pre-treatment  and  analysis  of  the 
samples,  data  recording,  error  trapping  and  construction  of  the  database,  because  of  the 
number  of  samples  and  the  magnitude  of  the  subsequent  analytical  programme 
(Loveland,  1990;  McGrath  &  Loveland,  1992). 

The  number  of  samples  was  restricted  to  those  falling  at  the  intersections  of  a  5-km 
orthogonal  grid.  The  sampling  grid  was  offset  1  km  north  and  east  of  the  origin  of  the 
Ordnance  Survey  National  Grid.  If  the  sampling  point  fell  on  anything  other  than  land, 
e.g.  on  a  road,  building,  water-body  etc.,  then  the  sampling  point  was  moved  100  m  north 
of  the  grid  node.  If  that  failed  to  locate  suitable  soil,  then  the  point  was  moved  100  m 
west  from  the  originally  intended  point.  This  process  was  repeated  in  steps  of  100  m  and 
200  m  from  the  grid  node,  in  the  order  north,  east,  south  and  west.  If  no  suitable  soil  was 
found  after  this  procedure,  then  the  site  was  abandoned  for  sampling  purposes,  although 
the  land-use  at  the  original  sampling  point  was  recorded  so  that  the  inventory  was 
complete  and  to  make  clear  the  reason  for  the  deviation.  If  a  new  sampling  point  was 
found,  then  the  standard  procedure  for  description  and  sampling  was  followed  at  that 
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point  (see  below).  In  this  way,  an  unbiased  record  of  the  occurrence  of  various  forms  of 
land-use  was  maintained. 

The  principal  interest  was  in  agricultural  land.  No  attempt  was  made  to  devise  a  sampling 
strategy  to  cover  urban  areas  adequately.  In  total  5691  sites  were  sampled.  The  grid- 
reference  located  the  site  to  within  10m  on  the  ground,  i.e.  to  an  accuracy  which  would 
place  any  return  visit  within  the  original  soil  sampling  sub-grid  (see  below). 


Sampling 

The  soil  profile  was  described  in  a  pit  dug  to  80  cm  (or  less  if  rock  was  encountered)  at 
each  sampling  point,  using  standard  terminology  (Hodgson,  1974).  However,  sampling 
was  restricted  to  the  uppermost  15  cm  of  mineral  soil  (or  less  if  rock  intervened),  or  of 
peat,  as  appropriate,  i.e.  litter  layers  were  not  sampled,  as  they  were  regarded  as 
ephemeral.  The  actual  sampling  depth  was  recorded.  Twenty-five  cores  of  soil  were  taken 
at  the  nodes  of  a  4m  grid  within  a  20  m  x  20  m  square  centred  on  the  Ordnance  Survey 
(OS)  5-km  grid-point.  The  cores  were  taken  with  a  screw-type,  mild-steel  auger,  to  avoid 
contamination  from  traces  of  elements  such  as  chromium  and  manganese  present  in 
stainless,  plated  or  similar  special  steels.  The  cores  of  soil  were  bulked  and  mixed  well  in 
the  field  and  double-bagged,  in  food-grade  polythene  bags,  and  a  waterproof  and  rot- 
proof  label  ('Synteape')  placed  between  the  bags. 

Samples  were  air-dried  and  milled  in  a  mild-steel  roller-mill  (Waters  &  Sweetman,  1955) 
to  pass  a  2-mm  aperture  sieve.  Preliminary  work  had  shown  that  no  detectable 
contamination  of  the  samples  arose  from  this  procedure.  The  resulting  data  set  comprises 
up  to  127  analytical  and  descriptive  parameters  for  each  of  5691  points  across  England 
and  Wales  (Loveland,  1990;  McGrath  &  Loveland,  1992).  This  collection  of  data  is  a 
unique  and  invaluable  resource, 

In  this  analysis  we  have  examined  only  pH  and  total  Zinc  because  they  represent  other 
variables  well.  From  a  principal  components  analysis  (PCA)  Zn  was  seen  to  load  heavily 
on  the  first  component  and  pH  on  the  third  axis,  which  reflected  the  effects  of  pareent 
material  (PM)  and  leaching.  The  pH  was  measured  by  a  combination  electrode  and  pH 
meter  in  a  1:2.5  soil- water  suspension  (MAFF,  1986)  on  soil  <2-mm.  Zinc  was 
determined  using  the  <150  micrometre  soil.  It  was  extracted  by  aqua  regia,  and 
determined  by  ICP-AES  (RES)  (McGrath  &  Cunliffe,  1985). 


Analysis 

Figures  74a  and  75a  show  the  full  set  of  pixel  information  on  the  5-km  grid  for  pH  and 
Zn.  (They  have  a  different  colour  scale  from  the  remaining  maps  because  they  were 
prepared  on  a  different  computer  for  the  Ministry  of  Agriculture  analysis.  Nevertheless 
the  variation  can  be  compared  and  the  relative  area  that  we  have  worked  on).  The 
analysis  for  this  project  was  carried  out  on  a  subset  of  the  full  NSI  data  This  was  because 
the  wavelet  analysis  requires  a  set  of  data  that  is  square  and  can  be  sampled  in  octaves. 
The  outline  of  the  data  for  England  and  Wales  is  irregular  and  we  selected  data  from  the 
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central  part  of  the  country  to  obtain  as  large  a  square  as  possible  that  would  suit  the  needs 
of  the  analysis.  This  resulted  in  a  data  set  with  3500  sites.  For  the  wavelet  analysis  the 
data  were  ‘padded’  with  zeros  so  that  there  were  no  gaps.  The  latter  arose  because  of  the 
shape  of  the  coastline  and  the  urban  areas  within  the  country  that  were  not  sampled  (see 
Figures  74a  and  75a);  they  appear  as  white  patches.  Figures  74b  and  75b  show  the  data 
that  were  extracted  for  the  analysis  in  this  report. 


Kilometres 


Figure  75.  Pixel  maps  for  Zinc:  a)  Original  data  for  England  and  Wales,  and  b)  Original 
data  selected  from  the  full  data  for  analysis 
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Table  10  gives  the  summary  statistics  of  the  subset  of  the  data  Zinc  was  strongly 
positively  skewed  which  can  be  seen  from  Table  10  and  Figure  76a,  therefore  it  was 
transformed  to  common  logarithms  (logio),  Figure  76b.  This  transformation  has  produced 
a  log-normal  distribution  which  is  common  for  many  elements,  Figure  76b  and  Table  10. 
The  histogram  for  pH  shows  that  this  has  a  close  to  normal  distribution.  Table  10.  A  near¬ 
normal  distribution  is  necessary  for  the  variogram  analysis  because  it  is  based  in 
variances,  which  are  unstable  if  the  data  do  not  have  a  near  normal  distribution. 


a)  Zinc 


b)  LogioZn 


c)pH 


Figure  76.  Histograms  from  the  subset  of  the  NSI  data  for  Zinc  and  pH. 


The  data  on  the  5-km  grid  were  further  subsampled  to  compare  the  results  of  data 
reconstruction  by  both  kriging  and  wavelet  analysis.  The  subsampling  produced  grids  of 
10-km  (1  site  in  every  block  of  4  sites  resulting  in  869  sites),  20-km  (lsite  in  every  block 
of  16  sites  resulting  in  219  sites),  and  40-km  (1  site  in  every  block  of  64  sites  resulting  in 
57  sites). 
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GEOSTATISTICAL  ANALYSIS 


Variogram  analysis 

The  spatial  structure  in  the  data  was  determined  by  computing  the  experimental 
variograms  of  pH  and  logioZn.  For  the  full  set  of  data  Zn  and  pH  showed  no  marked 
evidence  of  anisotropy,  therefore  omnidirectional  variograms  only  were  computed.  For 
both  the  full  data  and  the  subset  the  experimental  variogram  of  pH  showed  evidence  of 
trend.  The  semivariances  continued  to  increase  after  an  initial  sill  had  been  reached 
(Figure  77a).  This  suggests  the  presence  of  smooth  continuous  variation  that  violates  the 
assumptions  of  geostatistics,  which  assumes  that  the  variable  is  random.  Therefore,  we 
modelled  the  trend  by  linear  and  quadratic  functions  of  the  co-ordinates  so  that  the 
analysis  could  be  done  on  the  residuals  from  the  trend.  The  linear  function  was  less 
effective  in  accounting  for  the  trend  than  the  quadratic  one:  the  latter  removed  over  30% 
of  the  trend  in  both  cases.  The  variogram  was  then  computed  afresh  on  the  residuals,  and 
this  now  shows  a  more  simple  bounded  form.  Figure  77b. 

Most  of  the  variograms  were  fitted  by  nested  functions.  The  models  fitted  to  the  data 
included  single  exponential,  spherical,  and  power  functions  including  linear,  double 
exponential  and  spherical,  and  exponential  with  linear  functions.  For  logioZn  (Figure 
77c)  and  pH  of  the  residuals  double  spherical  models  provided  the  best  fit.  The  equations 
for  the  models  are  given  below. 


Double  exponential 

y(h)  =  c0  +  c,  {1  -  exp {-h //*,)}  +  c2 {1  -  exp (-h  / r2 )} 
where  c\  and  n  are  the  sill  and  distance  parameter  of  the  first  structure,  and  C2  and  r2  are 
the  sill  and  distance  parameter  of  the  second  structure. 


Double  spherical 


y(h)  =  c0  +  c,  j 

y(h)  =  c0  +c2< 
y(h)  =  c,  +c2 
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where  c\  and  a  1  are  the  sill  and  distance  parameter  of  the  first  structure,  and  c2  and  a2  are 
the  sill  and  distance  parameter  of  the  second  structure. 
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a)  pH  -  raw  data 
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c)  LogioZn 
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b)  pH  residuals 


log  distance/km 


Figure  77.  Experimental  variograms  (symbols)  and  fitted  models  (lines):  a)  raw  values  of 
pH,  b)  pH  residuals,  and  c)  logioZn. 


These  results  show  that  there  are  two  main  scales  of  spatial  variation:  a  short-range 
component  of  about  18  km  for  logioZn  and  37  km  for  pH  (residuals),  and  a  long-range 
component  of  61  km  for  logioZn  and  118  km  for  pH  (residuals).  The  average  short-range 
component  for  the  full  data  for  the  range  of  properties  examined  was  24  km,  and  the 
average  of  the  long-range  component  was  89  km.  There  are  slight  differences  in  the 
ranges  for  these  subsets,  but  they  are  within  similar  orders  of  magnitude.  A  characteristic 
of  the  variograms  of  the  subset  and  of  the  full  data  is  their  large  nugget  variance  (co):  it  is 
more  than  60%  of  the  sill  variance  for  most  variables.  Most  of  the  nugget  variance  can  be 
accounted  for  by  variation  over  distances  less  than  the  sampling  interval  of  the  grid.  This 
shows  that  the  5-km  grid  interval  misses  a  considerable  proportion  of  the  variation  in  the 
soil. 
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Figure  74  shows  the  pixel  map  of  pH.  There  are  two  spatial  scales  of  variation  evident  in 
the  map.  Areas  with  a  pH  of  less  than  6  are  in  the  western  part  of  the  country  in  the  main, 
which  is  also  where  the  main  uplands  are,  and  where  agriculture  is  dominated  by 
grassland  systems  -  optimum  pH  between  5  and  6..  These  are  also  the  wettest  parts  of 
England  and  Wales.  Much  of  central  and  eastern  England  has  pH  values  of  7  and  above, 
partly  reflecting  geology  and  the  distribution  of  calcareous  soils,  but  also  the  widespread 
use  of  lime  on  arable  soils  (optimum  pH  c.  6.5  -  7.5).  There  are  areas  of  lower  pH  in  the  S 
associated  with  the  Tertiary  sands  and  gravels.  The  E-W  differences  in  pH  values  reflect 
the  pattern  of  rainfall  as  well  as  elevation  and  land-use.  Figure  75  shows  the  original 
values  as  a  pixel  map  for  total  Zn.  There  are  many  areas  with  large  concentrations,  and 
the  most  extensive  of  these  follows  the  Jurassic  clay  band  from  SW  to  NE  across  the 
country.  There  are  other  areas  trending  N  to  S  from  the  Midlands  of  England  to 
Tynemouth  (not  on  the  subset  map).  These  seem  to  be  associated  with  the  Carboniferous 
shales  and  sandstones,  as  do  the  areas  of  large  concentrations  in  central  Wales. 


Kriging 

Ordinary  kriging  and  factorial  kriging  (kriging  analysis)  have  been  described  in  earlier 
reports  (Contract  Nos.  N68171-97-C-9029;  N68171-98-M-5311).  Ordinary  kriging  was 
used  to  reconstruct  the  data  after  subsampling  them  to  produce  smaller  data  sets.  Punctual 
kriging  was  used  so  that  the  estimates  and  maps  could  be  compared  with  predictions  from 
subsets  of  the  data.  The  estimation  grid  was  chosen  to  coincide  with  the  5-km  sampling 
grid.  Estimates  were  made  at  the  nodes  of  this  grid  so  that  we  could  compare  the  kriged 
estimates  at  the  sampling  points  with  the  original  values  where  these  had  been  removed. 
At  the  places  where  there  were  data  punctual  kriging  returns  the  sample  value  there.  The 
original  variograms  were  used  for  the  analysis  because  it  is  unlikely  that  their  structure 
would  change  over  time.  In  addition  those  from  the  subsets  have  large  nugget  variances 
and  they  are  less  reliable  because  there  are  few  comparisons  for  each  semivariance, 
especially  for  the  40-km  grid. 

For  pH  ordinary  kriging  was  done  on  the  residuals  and  the  quadratic  trend  added  back  to 
the  estimated  residuals  afterwards. 

The  ordinary  kriged  logarithmically  transformed  predictions  for  Zinc  were  back- 
transformed  for  mapping  so  that  the  variation  could  be  seen  on  the  original  scale  in  which 
the  variable  had  been  measured.  This  is  not  straightforward  because  the  kriging  variance 
must  be  taken  into  account.  The  equation  for  back-transformation  is: 

Z  =  exp{7(x0)  x  to  10  +  0.5cr2(x0)  x  (lnlO)2} 

where  Y(x0)  is  the  estimated  value  of  logioZn  at  xo  and  o2 y  is  the  estimation  variance. 

Factorial  kriging  was  done  on  the  full  set  of  data  to  examine  the  different  scales  of 
variation  in  the  data  and  to  compare  the  results  with  those  of  the  multi-resolution  wavelet 
analysis.  The  aim  is  to  filter  out  the  different  scales  of  variation,  so  that  the  independent 
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components  of  the  spatial  structure  can  be  examined  as  an  aid  to  further  interpretation. 
Factorial  kriging  estimates  the  long-  and  short-range  components  separately.  The 
variation  is  nested  for  both  pH  and  Zn;  the  variograms  have  two  spatial  structures.  The 
pixel  maps  of  the  raw  data.  Figures  74  and  75  suggest  that  there  are  two  scales  of 
variation,  and  this  is  confirmed  by  the  variogram  results.  The  variation  at  the  longer  scale 
appears  to  be  related  to  the  geology  for  Zn  and  pH  and  also  rainfall  and  elevation  for  the 
latter.  Zinc  is  a  good  example  of  many  of  the  other  variables  for  this  analysis.  For  the 
long-range 


WAVELET  ANALYSIS 

The  method  for  this  analysis  was  described  in  a  previous  report  (N68171-98-M-53 1 1)  and 
also  in  Oliver  et  al.  (2000)  for  SPOT  image  data.  This  is  the  first  analysis  that  we  are 
aware  of  using  soil  data  in  two  dimensions.  There  are  few  data  sets  in  the  world  for  soil 
that  are  on  a  grid  and  would  provide  adequate  data  for  this  analysis.  Wavelets  enable 
data  reconstruction  and  multi-resolution  analysis  by  deriving  the  low  frequency  and  high 
frequency  coefficients  from  the  data.  The  low  frequency  wavelet  transform  has  been  used 
to  restore  the  data  from  the  subsamples  on  the  original  5 -km  grid  and  to  identify  the  long- 
range  spatial  component  at  the  coarser  resolutions.  The  average  of  the  high  frequency 
wavelet  transforms  has  been  used  to  identify  the  short-range  component.  The  advantage 
of  wavelet  analysis  at  the  outset  for  the  pH  data  is  that  there  is  no  need  to  take  account  of 
trend.  An  important  advantage  of  this  analysis  is  that  it  is  unaffected  by  non-stationarity. 


RESULTS  FOR  pH 

The  following  series  of  maps  (Figures  78  to  80)  shows  the  reconstructed  values  of  pH 
from  ordinary  kriging  and  the  low  frequency  wavelet  coefficients  for  the  three  sampling 
grids.  One  noticeable  difference  between  the  maps  is  that  the  kriged  maps  appear  more 
‘spotty’.  This  is  because  kriging  returns  the  sample  value  at  the  data  point,  whereas  the 
wavelet  analysis  is  a  predicted  value  at  the  data  points  as  well  as  at  other  points.  Another 
difference  arises  from  the  fact  that  the  data  were  padded  for  the  wavelet  analysis  with 
zeros  -  these  are  the  larger  blue  areas  beyond  the  coastline  and  also  the  urban  areas  where 
there  were  no  sampling  locations.  Figure  78b  for  the  data  on  a  10-km  grid  there  is  slightly 
more  of  the  original  detail  in  the  variation  evident,  whereas  the  kriged  map  (Figure  78a) 
shows  the  effect  of  smoothing  from  kriging. 

For  Figure  79a  and  b  the  effects  of  the  greatly  reduced  number  of  sampling  sites  is 
evident  in  the  loss  of  detail.  For  Figure  79a  the  margin  around  the  map  is  because  there 
were  no  data  there  to  krige  from.  Kriging,  Figure  79a,  has  smoothed  the  variation  more 
than  wavelet  analysis,  Figure  79b,  and  the  spotty  appearance  of  the  former  map  is  the 
effect  of  punctual  kriging.  Figure  80a  and  b  shows  the  kriged  and  low  frequency  wavelet 
coefficients  for  the  40-km  grid.  It  is  clear  that  much  detail  has  been  lost  and  that  there  is 
more  difference  between  these  two  maps  than  between  those  in  Figures  78  and  79. 
Visually  the  wavelet  analysis  appears  to  have  performed  better  at  this  level  of  sampling 
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which  is  what  we  found  for  the  image  data  (Oliver  et  al.,  2000).  The  more  sparse  the 
sampling  the  better  the  wavelet  analysis  appears  to  perform  in  comparison  with  kriging. 
Figures  81  to  83  show  the  maps  of  the  comparisons  between  the  predictions  from 
ordinary  punctual  kriging  and  the  low  frequency  wavelet  transform,  and  the  original 
values  at  the  sampling  sites  of  the  5-km  grid.  Figure  81a  and  b  show  the  comparisons, 
i.e.  the  absolute  differences,  for  predictions  based  on  the  10-km  sampling  grid.  Figure 
81a  for  the  kriged  comparisons  is  a  more  spotty  map  than  the  one  from  the  wavelet 
analysis:  the  sampling  points  are  evident  as  the  blue  pixels  where  there  is  no  error.  For 
the  wavelet  analysis  for  this  sampling  grid  there  are  fewer  zero  or  small  errors  than  for 
kriging.  This  is  confirmed  by  the  histograms  of  the  differences,  Figure  84a  and  b. 
Kriging  also  appears  to  perform  better  for  predicting  the  values  from  the  20-km  grid  than 
the  wavelet  analysis  in  terms  of  the  small  errors,  Figure  82a  and  b.  The  histogram  of  the 
kriged  differences.  Figure  84c,  is  somewhat  misleading  because  there  are  fewer 
comparisons  for  kriging  than  for  the  wavelet  analysis,  and  it  is  likely  that  there  would  be 
more  of  the  larger  errors  than  is  evident  in  the  histogram.  The  slight  negative  skewness  in 
this  histogram  suggests  that  there  is  some  bias  in  the  predictions.  The  histograms,  Figure 
84c  and  d,  for  this  sampling  grid  (20-km)  are  more  similar  than  for  the  10-km  one.  The 
comparisons  for  the  predictions  from  the  40-km  grid  suggest  that  the  wavelet  analysis  has 
performed  slightly  better  at  this  level  of  sampling,  which  was  the  case  for  the  SPOT 
image  data  (Oliver  et  al.,  2000).  These  histograms.  Figure  84d  and  e  show  that  the 
wavelet  analysis  has  more  smaller  errors.  There  are  fewer  comparisons  for  kriging 
because  the  method  requires  a  minimum  of  4  points  within  the  search  radius  and  this  fails 
at  the  margins  of  the  error  when  the  sampling  points  become  sparse. 


Summary 

These  results  are  interesting  when  compared  with  the  analysis  of  the  SPOT  data.  The  NSI 
data  appear  not  contain  locally  non-stationary  data  for  pH.  These  would  occur  where 
there  are  marked  boundaries  in  the  soil,  for  example.  At  the  sampling  interval  used  here 
of  5-km  local  non-stationarity  is  less  likely  than  for  more  intensively  sampled  data  and 
remotely  sensed  data.  Therefore,  the  errors  for  kriging  are  less  than  they  were  for  the 
analysis  of  the  SPOT  data  where  there  were  marked  changes  at  lakes  and  other 
boundaries  causing  local  non-stationarity.  Since  kriging  is  an  exact  interpolator  and 
wavelet  analysis  is  not,  there  remains  the  need  to  combine  the  methods.  It  seems  that 
some  progress  on  this  has  been  made  at  the  Centre  de  Geostatististique,  Fontainebleau. 
However,  at  the  moment  it  is  difficult  to  ascertain  the  extent  of  this  and  we  shall 
endeavour  to  take  this  forward. 

Another  point  of  interest  from  this  analysis  is  that  the  pH  data  form  the  NSI  survey 
contain  long  distance  trend.  This  means  that  part  of  the  variation  depends  on  the  spatial 
coordinates.  This  violates  the  assumptions  of  geostatistics  in  the  same  way  as  local  trend 
or  drift,  i.e.  local  non-stationarity.  This  affected  the  variogram,  as  was  evident  above, 
Figure  77a  and  b,  and  meant  that  we  had  to  remove  the  trend  and  do  the  analysis  on  the 
residuals,  and  add  back  the  trend  after  kriging.  This  is  clearly  a  considerable  amount  of 
additional  effort  over  and  above  the  straightforward  analysis.  It  is  evident  from  the  results 
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of  the  wavelet  analysis  that  the  prediction  are  unaffected  by  the  trend.  Therefore,  if  there 
is  a  choice  of  method  available  -  situations  with  known  trend  present  would  benefit  from 
the  wavelet  analysis. 


Figure  80.  Predictions  of  pH  at  a  5-km  interval  from  data  on  a  40-km  grid:  a)  kriged 
estimates  and  b)  low  frequency  wavelet  transform. 
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Figure  82.  Comparisons  between  estimates  of  pH  from  data  on  a  20-km  grid:  a)  kriging 
and  b)  wavelet  analysis. 
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Figure  83.  Comparisons  between  estimates  of  pH  on  for  data  on  a  40-km  grid:  a)  kriging 
and  b)  wavelet  analysis. 
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a)  Kriged  pH  sampled  at  1  in  4 


Kriged  differences 

c)  Kriged  pH  sampled  at  1  in  16 


b)  Wavelet  pH  sampled  at  1  in  4 
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c)  Wavelet  pH  sampled  at  1  in  16 
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d)  Kriged  pH  sampled  at  1  in  64 
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e)  Wavelet  pH  sampled  at  1  in  64 
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Figure  84.  Histograms  of  the  differences  for  pH  from  kriging  and  wavelet  analysis. 
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Results  of  factorial  kriging  and  wavelet  analysis  for  pH 

Factorial  kriging  was  applied  to  the  data  on  the  5-km  grid,  but  the  equivalent  analysis  for 
wavelets  was  done  on  all  of  the  subsamples..  The  reasons  for  this  were  given  in  the 
previous  final  report.  Figure  85  shows  the  long-range  component  from  kriging  analysis. 
The  results  for  all  of  England  and  Wales  are  given  at  the  end  of  the  report,  Figure  98. 
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Figure  85.  Long-range  estimates  of  pH  from  kriging  analysis. 


The  map  of  the  long-range  estimates  for  pH  is  similar  to  the  kriged  estimates  from  the 
20-km  grid,  they  are  not  as  similar  to  any  of  the  low  frequency  wavelet  predictions, 
Figures  78  to  89.  The  long-range  variation.  Figure  85  shows  that  the  larger  values  are 
generally  associated  with  the  lowland  areas  and  the  limestone  uplands.  However,  the 
western  coastal  areas  have  large  values  of  pH  which  are  most  probably  associated  with 
the  deposition  of  sodium  ions  by  rain  in  these  areas. 

Figures  86b  and  87  a  and  b  show  the  high  frequency  wavelet  component  for  pH  from  the 
wavelet  analysis.  It  is  evident  that  the  result  for  the  20-km  grid  is  the  closest.  This  reflects 
the  same  resolution  for  extracting  the  long-range  component  also.  There  are  some 
similarities  in  the  detail  of  the  distributions,  but  there  are  also  differences.  In  the  future 
we  shall  examine  the  differences  between  these  particular  results  to  assess  their  relative 


performances  in  more  detail.  The  high  frequency  component  for  the  40-km  grid  has  not 
identified  the  relevant  short-range  component. 

Again  an  interesting  point  emerges  that  we  observed  in  the  previous  analysis  of  the  SPOT 
data.  The  level  of  resolution  at  which  the  wavelet  analysis  has  identified  the  long-  and 
short-range  components  of  the  variation  is  related  to  the  short-range  parameter  of  the 
variogram.  We  can  now  suggest  more  forcibly  that  for  a  multiresolution  analysis  using 
wavelets  the  best  approach  is  to  compute  the  variogram  first. 


Figure  86.  Short-range  variation  of  pH:  a)  from  kriging  analysis  based  on  the  5-km  grid, 
and  b)  the  high  frequency  wavelet  coefficient  from  data  on  the  10-km  grid. 
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RESULTS  FOR  ZINC 

Figure  75a  and  b  show  the  original  values  for  total  logioZn.  There  are  many  areas  with 
large  concentrations,  and  the  most  extensive  of  these  follows  the  Jurassic  clay  band  from 
SW  to  NE  approximately.  There  are  other  areas  trending  N  to  S  from  the  Midlands  of 
England  to  the  north.  These  seem  to  be  associated  with  the  Carboniferous  shales  and 
sandstones,  as  do  the  areas  in  central  Wales.  The  large  values  around  Avonmouth  (SW) 
are  associated  with  the  smelting  industry  there. 

For  Zn  the  common  logarithms  were  analysed  and  the  values  back-transformed  for 
mapping  as  described  above.  Figure  88a  and  b  shows  the  maps  of  the  predictions  using 
ordinary  punctual  kriging  and  the  low  frequency  wavelet  coefficients  for  data  on  the  10- 
km  grid.  The  results  are  similar.  The  spotty  appearance  of  the  kriged  map  arises  from  the 
fact  that  kriging  restores  the  data  at  the  sampling  points  with  non  error.  The  overall  result 
shows  that  kriging  smooths  more  than  the  wavelet  analysis.  Nevertheless  the  maps  are 
similar  to  those  for  the  original  data.  Figure  75b. 

The  pattern  of  variation  in  the  estimates  from  the  20-km  grid  for  both  analyses  is  also 
preserved  well,  Figure  89a  and  b.  The  degradation  in  detail  is  clear,  but  the  large-scale 
pattern  is  still  evident.  Again  the  results  for  both  methods  of  analysis  are  similar  -  more 
so  than  for  pH. 

Figure  90a  and  b  shows  the  results  for  the  40-km  grid.  The  results  from  the  wavelet 
analysis,  although  showing  a  loss  of  detail,  still  show  a  similar  pattern  of  variation, 
Figure  90b,  to  that  of  Figure  89b.  The  kriged  results  so  not  show  such  a  good 
resemblance  to  the  original  pattern  of  variation.  Note  particularly  the  loss  of  accuracy  in 
the  north  western  part  of  the  country. 


These  results  again  accord  with  the  findings  for  pH  and  for  the  analysis  of  the  SPOT 
image.  When  the  number  of  samples  is  few  and  the  distance  between  them  large  kriging 
restores  the  data  less  well  than  the  wavelet  analysis.  This  effect  is  supported  by  the  maps 
of  the  differences,  Figures  91  to  93  and  of  the  histograms,  Figure  94. 

Figures  91  to  93  show  the  maps  of  the  absolute  differences  between  the  predictions  from 
ordinary  punctual  kriging  and  the  low  frequency  wavelet  transform,  and  the  original 
values  at  the  sampling  sites  of  the  5-km  grid.  Figure  91a  and  b  shows  the  comparisons, 
for  predictions  based  on  the  10-km  sampling  grid.  Figure  91a  for  the  kriged  comparisons 
is  a  more  spotty  map  than  the  one  from  the  wavelet  analysis:  the  sampling  points  are 
evident  as  the  blue  pixels  where  there  is  no  error.  For  the  wavelet  analysis  for  this 
sampling  grid  there  are  fewer  zero  or  small  errors  than  for  kriging.  This  is  confirmed  by 
the  histograms  of  the  differences,  Figure  94a  and  b.  The  same  negative  skew  in  the  errors 
is  evident  for  Zn  as  for  pH.  Kriging  does  not  appear  to  have  performed  quite  as  well  for 
Zn  for  the  20-km  grid  as  the  wavelet  analysis  in  terms  of  the  small  errors,  Figure  94c  and 
d.  The  maps  of  the  differences,  Figure  92a  and  b  do  not  show  this  as  clearly.  Both 
methods  appear  to  have  performed  similarly  from  these  two  maps.  The  slight  negative 
skewness  in  this  histogram  again  suggests  that  there  is  some  bias  in  the  predictions.  The 
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comparisons  for  the  predictions  from  the  40-km  grid  suggest  that  there  is  less  difference 
between  the  wavelet  analysis  and  kriging  than  the  maps  of  the  estimates  suggested  there 
would  be.  Figure  93.  The  histograms,  Figure  94d  and  e  confirm  this,  although  direct 
comparison  is  not  possible  because  kriging  has  not  gone  to  the  edges  of  the  area  for  the 
reasons  given  before. 


Summary 

These  results  are  interesting  when  compared  with  the  analysis  of  the  SPOT  data.  The  NSI 
data  for  zinc  again  do  not  appear  not  contain  locally  non-stationary  data  as  for  pH.  This 
explains  the  somewhat  better  performance  of  kriging  for  the  10-km  grid. 

The  zinc  values  were  skewed  and  this  means  that  the  variances  when  computing  the 
variogram  are  unstable.  The  values  were  transformed  to  common  logarithms,  logioZn, 
and  kriging  was  performed  on  the  logarithms  and  these  values  were  back-transformed 
afterwards  so  that  the  values  could  be  shown  on  their  original  measurement  scale  as  for 
the  wavelet  analysis.  Again  this  is  clearly  involves  additional  effort  over  and  above  the 
wavelet  analysis,  which  does  not  require  non-normal  distributions  to  be  transformed.  This 
has  an  additional  advantage  because  the  transformation  causes  additional  smoothing  of 
the  predictions.  This  does  not  seem  to  be  particularly  evident  from  the  results  given  here. 
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Figure  89.  Predictions  of  Zn  at  a  5-km  interval  from  data  on  a  20-km  grid:  a)  kriged 
estimates,  and  b)  low  frequency  wavelet  coefficients. 


Figure  91.  Comparisons  between  estimates  of  Zn  from  data  on  a  10-km  grid  :  a)  for 
kriging,  and  b)  for  wavelet  analysis. 
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Figure  92.  Comparisons  between  estimates  of  Zn  from  data  on  a  20-km  grid:  a)  for 
kriging,  and  b)  for  wavelet  analysis. 


Figure  93.  Comparisons  between  estimates  for  Zn  from  data  on  a  40-km  grid:  a)  for 
kriging,  and  b)  for  wavelet  analysis. 
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a)  Kriged  Zn  sampled  at  1  in  4 
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d)  Kriged  Zn  sampled  at  1  in  64 
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b)  Wavelet  Zn  sampled  at  1  in  4 
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Wavelet  differences 


c)  Wavelet  Zn  sampled  at  1  in  16 
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Figure  94.  Histograms  of  the  differences  for  Zinc  from  kriging  and  wavelet  analysis 
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Results  of  factorial  kriging  and  wavelet  analysis  for  Zinc 

Factorial  kriging  was  applied  to  the  data  on  the  5-km  grid,  but  the  equivalent  analysis  for 
wavelets  was  done  on  all  of  the  subsamples..  The  reasons  for  this  were  given  in  the 
previous  final  report.  Figure  95  shows  the  long-range  component  from  kriging  analysis 
on  the  logarithmic  scale.  These  results  were  not  back-transformed  because  of  the  way  in 
which  these  estimates  are  derived.  The  long-range  kriged  estimates  for  logio  Zn  show  that 
the  largest  values  occur  near  to  the  Avonmouth  smelter  in  the  west  of  England,  and 
another  area  in  Derbyshire.  There  are  large  values  associated  with  the  Jurassic  clays 
trending  from  SW  to  NE,  to  the  Carboniferous  limestone  in  Derbyshire,  Carboniferous 
shales  in  the  NE  and  Ordovician  rocks  in  Wales.  This  distribution  has  some  similarities 
with  that  for  Cr.  The  values  for  logio  Zn  for  all  of  England  Wales  are  shown  in  the  end. 
These  results  show  the  closest  relations  with  the  20-km  grid  wavelet  analysis.  Figure  90b, 
even  though  the  colour  scales  appear  somewhat  different  because  Figure  95  is  for 
logarithms. 
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Figure  95.  Long-range  component  from  factorial  kriging  for  Zinc. 
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Figure  96.  Short-range  spatial  component  of  Zinc:  a)  from  factorial  kriging  on  the  5-km 
grid,  and  b)  high  frequency  wavelet  coefficient  for  data  on  the  10-km  grid 


Figure  97.  Short-range  variation  of  Zn:  a)  -  high  frequency  wavelet  coefficient  for  data 
on  the  20-km  grid,  and  b)  high  frequency  wavelet  coefficient  from  data  on  the  40-km 
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Figure  96a  shows  the  short-range  component  of  the  variation  from  kriging  analysis,  and 
the  maps  for  the  whole  of  England  and  Wales  for  this  analysis  are  given  at  the  end  of  the 
report,  Figure  99.  The  short-range  component  was  investigated  previously,  it  has  a  strong 
similarity  with  the  map  of  the  short-range  component  for  Cr  (not  shown).  These 
distributions  also  show  a  relation  with  the  small  scale  drainage  basins  and  local  changes 
in  rock  and  soil  types. 

Figures  96b  and  97  a  and  b  show  the  high  frequency  wavelet  component  for  Zn  from  the 
wavelet  analysis.  It  is  evident  that  the  result  for  the  20-km  grid  is  the  closest  to  that  for 
the  short-range  component  from  kriging  analysis.  This  reflects  the  same  resolution  for 
extracting  the  long-range  component  also.  There  are  some  similarities  in  the  detail  of  the 
distributions,  but  there  are  also  differences.  In  the  future  we  shall  examine  the  differences 
between  these  particular  results  to  assess  their  relative  performances  in  more  detail.  The 
high  frequency  component  for  the  40-km  grid  has  also  identified  some  of  the  relevant 
short-range  component. 

Again  an  interesting  point  emerges  that  we  observe  above  is  that  the  level  of  resolution  at 
which  the  wavelet  analysis  has  identified  the  long-  and  short-range  components  of  the 
variation  is  related  to  the  short-range  parameter  of  the  variogram. 
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Figure  98.  pH  —  results  of  factorial  kriging:  short-  and  long-range  components 
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Figure  99  Zinc  -  results  of  factorial  kriging:  short-  and  long-range  components. 
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PART  IV:  AIDE  MEMOIRE  -  SPATIAL  SAMPLING 

Introduction 

The  land  surface,  the  materials  of  which  it  is  composed  and  the  environment  more  generally  are 
continuous.  In  general  measurements  or  observations  can  be  made  on  only  small  portions  of  them, 
i.e.  on  samples,  because  of  the  large  areas  involved.  For  example,  in  a  single  agricultural  field  there 
is  an  infinite  number  of  potential  sampling  points.  Samples  intended  to  represent  the  areas  from 
which  they  are  drawn  must  be  planned  with  care.  The  information  from  a  sample  location  should 
represent  a  surrounding  area,  the  extent  of  which  we  might  not  know.  Since  many  environmental 
properties  vary  locally  in  a  complex  and  erratic  way  the  values  from  a  single  sampling  point 
include  a  sampling  effect.  To  increase  the  information  from  a  sampling  location  so  that  it  is 
representative  a  bulked  sample  can  be  taken,  and  provided  that  the  property  is  additive  the 
measurement  made  on  it  will  equal  the  regional  mean  apart  from  sampling  error. 

At  the  outset  consider  the  use  that  will  be  made  of  the  sample  information.  For  instance,  will  the 
mean  values  of  the  properties  observed  for  the  entire  area  or  for  strata  within  the  area  be  used  to 
predict  at  unsampled  places?  Or  will  the  information  be  used  to  predict  locally,  either  using 
mathematical  interpolators  or  geostatistical  ones.  For  either  of  the  latter  the  sample  data  must  be 
spatially  autocorrelated  for  them  to  have  any  merit. 

This  aide  lists  the  matters  that  must  be  considered  and  resolved  in  planning  sampling  of  a 
geographic  region,  which  for  present  purposes  we  treat  as  two-dimensional. 

Defining  the  target 

The  domain 

The  domain  is  the  region  of  interest.  Circumscribe  it  by  a  boundary  on  a  map  so  that  every  point 
can  be  assigned  to  the  domain  or  not  with  certainty.  The  domain  may  comprise  a  single  parcel  of 
land  or  several.  Denote  it  by  D. 

Support 

The  support  is  the  area  or  volume  of  material  on  which  you  make  measurements.  It  has  size  and 
shape,  and  may  have  orientation.  In  remote  sensing  it  is  the  'footprint'  of  the  pixel;  in  vegetation 
surveys  it  is  the  quadrat;  in  soil  survey  it  is  the  core  of  soil  taken  from  the  ground.  Cores  of  soil 
may  be  taken  from  areas  larger  than  the  cross-section  of  the  cores  and  bulked  for  analysis  in  the 
laboratory.  In  these  cases  the  supports  are  the  larger  areas. 

In  any  one  survey  define  the  support  and  keep  it  constant  throughout. 
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The  population  and  units 

Within  D  are  units  that  have  the  dimensions  of  the  supports.  In  a  remote  image  their  number  is 
finite  though  large.  In  soil  survey  they  are  so  many  that  they  may  be  regarded  as  infinite.  Define 
them  by  their  spatial  coordinates  and  their  spatial  extents.  Together  they  comprise  the  population. 
The  terms  ‘population’  and  ‘units’  may  be  used  to  refer  to  the  values  of  a  variable  of  the  supports. 

The  target 

Within  D  there  may  be  only  certain  kinds  of  terrain  or  land  use  that  are  of  interest,  e.g.  only  dry¬ 
land  (not  water),  only  farm  land  (not  towns,  not  parks,  not  golf-courses,  etc.).  The  units  falling  in 
these  classes  constitute  the  target  population.  The  others  do  not  belong. 

Samples 

Whole  populations  cannot  be  measured  in  ground  survey;  you  can  measure  only  subsets  of  the  units 
that  comprise  them.  Such  a  subset  of  units  is  a  sample. 

Typically  you  will  want  two  characteristics  in  a  sample  -  accuracy  and  reliability.  The  first  means 
that  a  sample  represents  the  population  without  bias,  i.e.  any  value  that  we  obtain  from  a  sample 
will  be  as  likely  to  exceed  the  true  value  of  the  population  as  it  will  be  to  fall  short.  The  second 
implies  that  repeated  sampling  will  give  sensibly  the  same  result.  It  is  measured  by  the  estimation 
variance  or  standard  error  of  the  mean,  s.e. 

These  characteristics  can  be  assured  by  the  sampling  design  in  which  there  is  sufficient 
randomness. 

Notation 

We  adopt  the  following  basic  notation. 

D  denotes  the  domain. 

|D|  is  the  area  of  D. 
z  is  the  variable  of  interest. 

Z  is  a  random  variable. 

Z(x)  is  a  random  process,  random  field,  stochastic  process,  in  which  Z  may  take  any  one  of  two  or 
more  values  at  random  at  each  point  x  in  D. 

N  is  the  size  of  the  sample  in  D,  i.e.  the  number  of  units  in  it. 

Dk  denotes  the  Jfcth  subdivision  of  D,  of  which  there  may  be  K. 
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nk  is  the  number  of  units  in  a  sample  of  Dk. 

H  denotes  the  mean  of  z  in  D. 

z  is  the  mean  of  the  N  data  drawn  from  D. 
o2  is  the  variance  of  z  in  D. 

s2  =  a2  is  the  estimate  of  o2  from  the  N  data. 

s2(D )  is  the  estimation  variance  of  /r  in  D. 

s(D )  is  the  standard  error  of  p. 

h  denotes  the  lag  separating  two  places,  and  is  a  vector  in  two  dimensions;  |h|  is  the  distance 
component  of  the  lag. 

jfh)  signifies  the  semivariance  at  lag  h. 

Xi  are  the  kriging  weights. 


Sampling  designs  for  design-based  estimation 

This  is  essentially  the  classical  statistical  approach  to  sample  design  and  prediction. 

Simple  random  sampling 

In  simple  random  sampling  N  units  are  chosen  with  equal  probability  from  the  target  population. 
The  result  is  unbiased,  and  the  estimation  variance  s2(D)  is  given  by  s2/N. 

If  there  is  any  spatial  correlation  at  the  working  scale  then  this  is  inefficient  in  the  sense  that  the 
same  estimation  variance  could  be  achieved  with  a  smaller  sample  by  a  better  design. 

Stratified  random  sampling 

Divide  the  region  into  strata,  Dk,  £=1,2, . . ,  K,  and  represent  each  by  a  few  units,  ideally  two, 
chosen  at  random  independently.  The  sizes  nk  may  be  chosen  in  proportion  to  the  areas  of  the  Dk, 
\Dk\,  if  they  are  not  equal. 

If  other  sizes  are  chosen  then  the  mean  in  D  may  be  calculated  as  the  weighted  average  of  the 
individual  stratum  means  with  weights  proportional  to  the  \Dk\.  The  estimation  variance  of  stratified 
sampling  depends  on  the  variance  within  the  strata,  or  the  pooled  within  stratum  variance.  In  the 
presence  of  spatial  dependence  the  latter  is  less  than  the  total  variance  in  the  population,  and  so 
stratified  sampling  is  more  efficient  than  simple  random  sampling. 

The  estimation  variance  is  given  by 
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A=1 

where  ^(D*)  is  the  estimation  variance  within  stratum  Z)*,  and  Wk  is  the  weight  assigned  to  the 
stratum.  The  weights  should  sum  to  1  to  avoid  bias. 

There  are  numerous  ways  in  which  this  general  scheme  can  be  elaborated  according  to  what  you 
know  of  the  region  and  the  variation  within  it.  For  example  the  strata  could  have  unequal  spatial 
extents  as  in  classification.  In  this  case  the  different  areas  are  taken  into  account  through  the  weight 
Wk,  such  that 

area  of  stratum  k 

wt  = - ; -  . 

total  area 


Systematic  sampling 

Sampling  is  usually  most  efficient  when  done  on  a  regular  grid.  It  has  two  disadvantages: 

(1)  it  provides  no  ready  estimate  of  the  variance; 

(2)  it  may  lead  to  biased  estimates  of  the  mean. 

The  first  arises  because  once  the  origin  and  orientation  of  the  grid  are  decided  there  is  no  further 
randomization  possible.  It  is  not  easily  overcome,  but  the  estimation  variance  may  be  approximated 
by  methods  such  as  Yates's  balanced  differences. 

The  second,  bias,  can  happen  where  there  is  trend  or  periodicity  in  z  in  the  region.  Periodicity  is 
usually  evident,  and  if  it  is  then  you  can  choose  an  interval  and  orientation  that  will  be  out  of  tune 
with  it.  Alternatively,  choose  a  non-aligned  scheme  in  which  each  sampling  point  on  the  grid  is 
offset  from  its  node  by  a  random  distance  along  its  row  and  down  its  column  according  to  a  rule. 

Sample  size 

The  size  of  sample  N  may  depend  on  the  budget  or  the  tolerance,  i.e.  error  that  can  be  tolerated  in 
the  estimate  from  the  survey.  If  the  budget  is  fixed  then  choose  a  stratified  scheme  to  minimize 
the  error  for  that  budget. 

If  the  error  is  specified  as  s(D)  then  for  simple  random  sampling 


N  =  s2/s2(D), 


The  formula  for  stratified  sampling  is  more  elaborate. 
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You  usually  do  not  know  s2  in  advance,  and  so  choosing  N  is  problematic.  Therefore  sample  in 
stages,  starting  with  a  sparse  design  that  can  be  intensified  as  necessary.  At  each  stage  calculate  the 
estimation  variance  to  see  whether  it  meets  the  tolerance.  If  it  does  then  stop;  otherwise  intensify 
the  sampling  and  recompute  the  estimation  variance  as  the  next  stage. 


Geostatistical  (model-based)  sampling  design  and  prediction 

Geostatistics  is  used  to  estimate  local  values  rather  than  regional  ones,  i.e.  to  predict.  It  is  based  on 
the  assumption  that  z  in  the  real  world  is  a  realization  of  the  random  process  Z(x).  For  this  reason 
there  is  no  need  to  randomize  the  sampling,  and  grid  sampling  is  preferred  because  of  its  efficiency. 

Geostatistical  prediction  (kriging)  requires  a  model  of  the  correlation  structure,  expressed  either  as 
a  covariance  function,  or  rather  more  generally  as  a  variogram.  Like  the  variance  in  design-based 
estimation,  these  functions  are  not  known  a  priori  and  must  be  estimated  from  sample  data. 
Sampling  must  therefore  serve  two  purposes: 

(1)  estimation  and  modelling  of  the  variogram,  and 

(2)  local  prediction  once  the  variogram  has  been  estimated  and  modelled. 

To  satisfy  item  (1)  sampling  must  be  sufficient  to  estimate  the  semivariances  precisely.  It  must  also 
be  dense  enough  to  estimate  the  spatial  characteristics  of  the  variation,  such  as  correlation  range 
and  general  form  of  the  variogram. 

Sampling  for  item  (2)  will  depend  either  on  the  budget  or  on  the  tolerable  error  of  local  predictions 
and  the  variogram. 

Sampling  to  estimate  the  variogram 

Nested  sampling  and  analysis 

Start  with  nested  sampling  and  a  hierarchical  analysis  of  variance  of  the  sample  data  if  you  know 
nothing  of  variation  in  the  region.  Choose  five  or  six  sampling  intervals  in  geometric  progression 
from  the  smallest  lag  distance  of  interest  to  the  largest.  Choose  the  angular  separations  at  random. 
Replicate  at  the  longer  distances  to  give  sufficient  degrees  of  freedom  in  the  analysis  of  variance  to 
estimate  the  components.  Expect  to  have  a  total  sample,  N,  of  about  100.  Figure  1  shows  the  kind 
of  sampling  plan  to  aim  for. 

Accumulate  the  components  of  variance  to  estimate  ^(|h|)  at  the  distances  of  the  design  and  draw  a 
crude  variogram  with  the  logarithm  of  |h|  on  the  abscissa  as  in  Figure  2. 
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Figure  100.  The  plan  of  sampling  for  one  main  centre  in  a  nested  survey  with  7  stages.  The  stages 
in  the  hierarchy  are  given  for  each  sampling  site. 
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Figure  101.  The  accumulated  components  of  variance  from  a  hierarchical  analysis  of  variance 
giving  a  first  approximation  to  the  variogram. 
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Such  a  result  this  can  be  used  to  identify  the  range  of  distance  within  which  most  variance  occurs 
and  to  plan  further  sampling  to  estimate  the  conventional  variogram. 


If  all  the  variance  appears  to  occur  within  the  smallest  distance  of  interest  then  local  prediction  is 
not  feasible.  So  stop!  Figure  3  shows  an  example  of  a  pure  nugget  reconnaissance  variogram.  All 
of  the  variation  is  occurring  within  the  shortest  sampling  interval. 


Figure  3.  A  pure  nugget  reconnaissance  variogram  from  a  nested  survey. 


Estimating  the  variogram  parameters 

Use  the  result  from  the  hierarchical  analysis  above  or  other  knowledge  of  the  variation  in  D  to 
estimate  semivariances,  h),  at  several  lags,  h,  within  the  correlation  range.  Design  a  scheme  with 
approximately  1 00  to  1 50  sampling  points  if  the  variation  appears  isotropic.  If  a  square  grid  with 
this  number  gives  you  sufficient  estimates  of  Kh)  within  the  correlation  range  then  use  it.  If  not 
then  cluster  the  sampling  in  some  way.  Intensify  sampling  around  a  subset  of  grid  nodes,  bearing  in 
mind  that  you  are  likely  to  want  a  grid  for  kriging  later.  Alternatively,  sample  in  clusters  with  a 
range  of  sampling  distances  between  locations,  and  spread  the  clusters  evenly  over  D  so  that  the 


Do  not  cluster  sampling  in  parts  of  D  that  you  know  or  suspect  to  have  unusually  large  values  ofz 
(as  you  might  in  mineral  surveys  or  pollution  studies)  or  unusually  small  ones  (as  in  studies  of 
deficiency  diseases).  This  will  result  in  bias. 

Compute  the  sample  variogram  and  plot  the  result.  If  the  estimated  values  fall  close  to  a  smooth 
curve  then  choose  an  authorized  model  to  describe  it,  estimate  its  parameters,  and  proceed  to 
kriging. 
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If  there  is  too  much  scatter  to  identify  a  plausible  function  then  increase  the  sampling,  either  by 
intensifying  the  grid  or  by  adding  clusters,  and  recompute  the  variogram.  Repeat  until  a  smooth 
form  is  identifiable. 

If  the  variation  is  anisotropic  and  you  wish  to  model  the  anisotropy  then  you  must  expect  to  sample 
at  200  points  or  more. 


Kriging 

In  kriging  Z  at  an  unknown  point  x0  minimize  the  prediction  variance 

o-2(xo)  =  2  £A,7(x0  -x,.)- XltWjrix, -x.)  ,  (1) 

/=i  /=!  j= i 


where  n  «  N  is  the  number  of  sampling  points  near  to  the  target  point  Xo.  The  quantities 
r(x,  ~xj)  andy(x0  -xy) depend  on  the  separations  x,.  -xy  andx0  -xy;  the  larger  these  are  the 
larger  is  c/(x0). 

The  maximum  value  of  c?(x o)  is  minimized  by  sampling  on  a  regular  grid.  A  triangular  grid  is 
usually  the  most  efficient,  but  rectangular  grids  are  almost  as  good  (Figure  3a),  and  as  they  are 
easier  to  lay  out  and  document  they  are  preferred.  If  variation  is  isotropic  then  use  a  square  grid. 

If  the  budget  is  fixed  then  sample  as  intensely  as  it  permits.  If  a  maximum  tolerance  is  specified, 
say  i'Kmax-  then  solve  the  kriging  system  for  a  range  of  sampling  intensities  (grid  intervals)  and  plot 
the  kriging  variance  (or  its  square  root,  the  kriging  error)  on  the  ordinate  against  the  grid  interval  on 
the  abscissa.  Connect  the  points  by  a  smooth  line,  Figure  3.  From  52Kmax,  or  .?Kmax,  draw  a  horizontal 
line  until  it  meets  the  curve,  and  from  that  intersection  drop  a  perpendicular.  The  value  at  which  the 
perpendicular  cuts  the  abscissa  is  the  required  sampling  interval.  Figure  3b. 

Determine  the  number  of  cores  in  bulked  samples  similarly.  Compute  the  estimation  variances 
using  Equation  (1)  for  equispaced  sampling  configurations  and  sample  sizes  from  4  to  about  50  and 
join  the  values  to  form  a  curve  (Figure  4).  Draw  a  horizontal  line  at  the  maximum  tolerable 
variance,  and  drop  a  perpendicular  from  the  point  at  where  it  intercepts  the  curve  to  the  abscissa. 
The  value  on  the  abscissa  is  the  optimum  size  of  sample. 
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Grid  spacing  Grid  spacing 

Figure  102.  Kriging  variances  from  (a)  punctual  kriging,  and  (b)  block  kriging. 
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Figure  103.  Graphs  of  standard  error  plotted  against  sample  size  for  bulking  from  4,  9, 16,  25,  36 
and  49  cores,  and  for  three  different  sample  supports. 
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APPENDIX  I 

Correlation  matrix  for  the  126  hyperspectral  bands. 
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0.916 

0.923 

0.932 

0.928 

0.934 

0.930 

0.935 

0.931 

0.936 

0.929 

0.933 

0.927 

0.930 

0.923 

0.926 

0.917 

0.920 

0.910 

0.913 

0.901 

0.903 

0.898 

0.900 

0.899 

0.902 

0.905 

0.907 

0.915 

0.917 

0.921 

0.923 

0.922 

0.924 

0.921 

0.924 

0.919 

0.923 

0.922 

0.925 

0.927 

0.932 

0.929 

0.933 

0.931 

0.938 

0.926 

0.929 

0.928 

0.935 

0.914 

0.918 

0.895 

0.899 

0.605 

0.598 

22 

23 

1.000 

0.998 

1.000 

0.999 

0.999 

0.999 

0.999 

0.998 

0.999 

0.997 

0.998 

0.996 

0.998 

0.995 

0.998 

0.995 

0.997 

0.996 

0.996 

0.994 

0.991 

0.991 

0.995 

0.991 

0.987 

0.988 

0.989 

0.983 

0.986 

0.983 

0.986 

0.985 

0.985 

0.985 

0.983 

0.985 

0.985 

0.985 

0.986 

0.985 

0.985 

0.983 

0.985 

0.983 

0.983 

0.714 

0.708 

0.724 

0.719 

0.734 

0.728 

0.742 

0.737 

0.748 

0.744 

0.757 

0.753 

0.766 

0.762 

0.771 

0.768 

0.771 

0.769 

0.758 

0.755 

0.699 

0.708 

0.920 

0.930 

0.937 

0.947 

0.935 

0.944 

0.929 

0.937 

0.941 

0.948 

0.943 

0.949 

0.942 

0.947 

0.942 

0.946 

0.938 

0.941 

0.935 

0.937 

0.930 

0.932 

0.924 

0.925 

0.916 

0.917 

0.906 

0.908 

0.903 

0.905 

0.904 

0.906 

0.910 

0.911 

0.920 

0.922 

0.927 

0.929 

0.929 

0.932 

0.929 

0.932 

0.929 

0.933 

0.933 

0.938 

0.940 

0.945 

0.942 

0.948 

0.946 

0.952 

0.941 

0.947 

0.945 

0.952 

0.930 

0.938 

0.911 

0.918 

0.613 

0.619 

24 

25 

1.000 

1.000 

1.000 

1.000 

1.000 

0.999 

1.000 

0.999 

0.999 

0.998 

0.999 

0.998 

0.999 

0.998 

0.999 

0.994 

0.996 

0.995 

0.996 

0.991 

0.993 

0.991 

0.992 

0.987 

0.988 

0.986 

0.988 

0.987 

0.989 

0.986 

0.988 

0.988 

0.989 

0.988 

0.990 

0.988 

0.989 

0.987 

0.988 

0.985 

0.987 

0.730 

0.864 

0.740 

0.868 

0.750 

0.873 

0.758 

0.877 

0.764 

0.880 

0.773 

0.884 

0.782 

0.888 

0.787 

0.888 

0.788 

0.888 

0.774 

0.869 

0.714 

0.629 

0.934 

0.836 

0.950 

0.860 

0.947 

0.868 

0.944 

0.875 

0.953 

0.895 

0.955 

0.905 

0.954 

0.913 

0.953 

0.921 

0.949 

0.925 

0.946 

0.928 

0.941 

0.929 

0.935 

0.929 

0.928 

0.929 

0.919 

0.925 

0.916 

0.926 

0.918 

0.927 

0.923 

0.929 

0.932 

0.930 

0.938 

0.929 

0.941 

0.924 

0.941 

0.921 

0.941 

0.918 

0.946 

0.912 

0.952 

0.910 

0.954 

0.901 

0.957 

0.898 

0.952 

0.880 

0.955 

0.879 

0.940 

0.855 

0.920 

0.831 

0.623 

0.544 

26 

27 

1.000 

1.000  - 

1.000 

1.000 

1.000 

0.999 

1.000 

0.999 

1.000 

0.999 

0.999 

0.996 

0.996 

0.997 

0.998 

0.993 

0.994 

0.993 

0.994 

0.990 

0.991 

0.990 

0.991 

0.990 

0.991 

0.990 

0.991 

0.991 

0.992 

0.991 

0.992 

0.991 

0.992 

0.990 

0.991 

0.989 

0.990 
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45 

0.794 

0.956 

0.978 

0.981 

0.979 

0.982 

0.984 

0.986 

0.987 

46 

0.802 

0.954 

0.974 

0.977 

0.973 

0.978 

0.980 

0.981 

0.983 

47 

0.831 

0.964 

0.972 

0.967 

0.969 

0.970 

0.972 

0.975 

0.976 

48 

0.854 

0.961 

0.957 

0.950 

0.951 

0.953 

0.955 

0.959 

0.960 

49 

0.866 

0.957 

0.946 

0.936 

0.937 

0.939 

0.941 

0.945 

0.946 

50 

0.869 

0.954 

0.942 

0.932 

0.932 

0.934 

0.936 

0.940 

0.941 

51 

0.871 

0.950 

0.935 

0.927 

0.924 

0.927 

0.930 

0.934 

0.935 

52 

0.873 

0.950 

0.934 

0.925 

0.922 

0.925 

0.928 

0.932 

0.933 

53 

0.869 

0.950 

0.936 

0.927 

0.925 

0.928 

0.931 

0.934 

0.936 

54 

0.864 

0.949 

0.938 

0.930 

0.927 

0.930 

0.934 

0.937 

0.939 

55 

0.859 

0.947 

0.937 

0.931 

0.927 

0.931 

0.934 

0.937 

0.939 

56 

0.858 

0.947 

0.937 

0.930 

0.927 

0.930 

0.933 

0.937 

0.939 

57 

0.860 

0.947 

0.935 

0.927 

0.925 

0.928 

0.931 

0.935 

0.936 

58 

0.862 

0.942 

0.929 

0.920 

0.917 

0.921 

0.924 

0.928 

0.929 

59 

0.867 

0.936 

0.918 

0.909 

0.905 

0.909 

0.912 

0.916 

0.918 

60 

0.874 

0.926 

0.902 

0.892 

0.887 

0.891 

0.895 

0.899 

0.900 

61 

0.877 

0.909 

0.879 

0.868 

0.860 

0.866 

0.870 

0.874 

0.875 

62 

0.861 

0.887 

0.858 

0.851 

0.838 

0.845 

0.850 

0.853 

0.854 

63 

0.149 

0.107 

0.087 

0.083 

0.076 

0.079 

0.079 

0.079 

0.080 

64 

0.261 

0.170 

0.129 

0.118 

0.108 

0.112 

0.113 

0.113 

0.114 

65 

0.508 

0.315 

0.222 

0.196 

0.183 

0.186 

0.189 

0.191 

0.192 

66 

0.664 

0.411 

0.286 

0.250 

0.234 

0.238 

0.241 

0.245 

0.245 

67 

0.715 

0.443 

0.307 

0.266 

0.252 

0.255 

0.258 

0.263 

0.263 

68 

0.733 

0.467 

0.331 

0.290 

0.277 

0.279 

0.283 

0.288 

0.288 

69 

0.753 

0.496 

0.363 

0.324 

0.309 

0.312 

0.316 

0.321 

0.321 

70 

0.772 

0.524 

0.394 

0.355 

0.341 

0.344 

0.348 

0.353 

0.353 

71 

0.792 

0.557 

0.429 

0.391 

0.377 

0.380 

0.385 

0.390 

0.390 

72 

0.809 

0.587 

0.463 

0.425 

0.412 

0.415 

0.420 

0.425 

0.425 

73 

0.824 

0.614 

0.494 

0.457 

0.444 

0.447 

0.452 

0.457 

0.457 

74 

0.836 

0.637 

0.520 

0.485 

0.472 

0.475 

0.479 

0.485 

0.485 

75 

0.846 

0.657 

0.543 

0.507 

0.495 

0.498 

0.503 

0.508 

0.508 

76 

0.853 

0.673 

0.561 

0.527 

0.514 

0.518 

0.522 

0.527 

0.528 

77 

0.859 

0.685 

0.577 

0.543 

0.530 

0.534 

0.538 

0.543 

0.544 

78 

0.861 

0.694 

0.587 

0.554 

0.541 

0.545 

0.549 

0.555 

0.555 

79 

0.865 

0.703 

0.598 

0.565 

0.552 

0.556 

0.560 

0.565 

0.566 

80 

0.867 

0.709 

0.605 

0.572 

0.560 

0.564 

0.568 

0.573 

0.574 

81 

0.869 

0.713 

0.609 

0.576 

0.564 

0.568 

0.572 

0.577 

0.578 

82 

0.870 

0.714 

0.610 

0.577 

0.565 

0.568 

0.572 

0.578 

0.578 

83 

0.871 

0.714 

0.609 

0.576 

0.564 

0.567 

0.572 

0.577 

0.577 

84 

0.871 

0.712 

0.607 

0.574 

0.561 

0.565 

0.569 

0.574 

0.574 

85 

0.870 

0.707 

0.600 

0.566 

0.554 

0.557 

0.561 

0.566 

0.567 

86 

0.865 

0.695 

0.587 

0.552 

0.539 

0.543 

0.547 

0.552 

0.552 

87 

0.864 

0.689 

0.578 

0.543 

0.530 

0.533 

0.537 

0.543 

0.543 

88 

0.861 

0.681 

0.569 

0.533 

0.520 

0.523 

0.528 

0.533 

0.533 

89 

0.856 

0.672 

0.559 

0.523 

0.510 

0.513 

0.518 

0.523 

0.523 

90 

0.852 

0.662 

0.547 

0.510 

0.498 

0.501 

0.505 

0.510 

0.510 

91 

0.847 

0.653 

0.535 

0.498 

0.487 

0.489 

0.493 

0.498 

0.498 

92 

0.841 

0.643 

0.525 

0.487 

0.476 

0.478 

0.482 

0.488 

0.488 

93 

0.838 

0.639 

0.520 

0.481 

0.471 

0.473 

0.477 

0.483 

0.483 

94 

0.815 

0.619 

0.502 

0.463 

0.454 

0.456 

0.459 

0.465 

0.465 

95 

0.390 

0.133 

0.021 

-0.011 

-0.024 

-0.023 

-0.021 

-0.019 

-0.020 

96 

0.525 

0.188 

0.038 

-0.006 

-0.020 

-0.020 

-0.018 

-0.014 

-0.015 

97 

0.549 

0.208 

0.055 

0.009 

-0.004 

-0.004 

-0.002 

0.003 

0.002 

98 

0.567 

0.230 

0.077 

0.032 

0.019 

0.019 

0.021 

0.026 

0.025 

99 

0.595 

0.268 

0.118 

0.073 

0.059 

0.060 

0.062 

0.066 

0.065 

100 

0.620 

0.294 

0.142 

0.096 

0.083 

0.083 

0.086 

0.090 

0.089 

101 

0.642 

0.321 

0.170 

0.125 

0.111 

0.112 

0.114 

0.119 

0.118 

102 

0.664 

0.349 

0.200 

0.155 

0.140 

0.141 

0.144 

0.148 

0.146 

103 

0.681 

0.371 

0.222 

0.177 

0.163 

0.163 

0.166 

0.170 

0.168 

104 

0.695 

0.388 

0.240 

0.195 

0.181 

0.182 

0.184 

0.188 

0.187 

105 

0.705 

0.402 

0.254 

0.209 

0.195 

0.195 

0.198 

0.202 

0.201 

106 

0.713 

0.414 

0.266 

0.221 

0.207 

0.208 

0.210 

0.215 

0.213 

107 

0.721 

0.426 

0.279 

0.233 

0.219 

0.220 

0.223 

0.227 

0.226 

108 

0.729 

0.438 

0.292 

0.247 

0.233 

0.234 

0.237 

0.241 

0.240 

109 

0.736 

0.450 

0.305 

0.261 

0.246 

0.248 

0.251 

0.256 

0.254 

110 

0.740 

0.456 

0.312 

0.268 

0.253 

0.255 

0.258 

0.263 

0.262 

111 

0.741 

0.457 

0.313 

0.268 

0.254 

0.255 

0.258 

0.263 

0.262 

112 

0.737 

0.450 

0.305 

0.261 

0.246 

0.248 

0.251 

0.255 

0.254 

113 

0.727 

0.434 

0.288 

0.243 

0.229 

0.230 

0.233 

0.238 

0.236 

114 

0.717 

0.419 

0.271 

0.226 

0.212 

0.213 

0.216 

0.220 

0.219 
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115 

0.704 

0.402 

0.254 

0.209 

116 

0.696 

0.392 

0.244 

0.199 

117 

0.687 

0.381 

0.232 

0.186 

118 

0.670 

0.358 

0.209 

0.164 

119 

0.657 

0.340 

0.189 

0.143 

120 

0.635 

0.313 

0.162 

0.117 

121 

0.622 

0.296 

0.145 

0.098 

122 

0.596 

0.267 

0.116 

0.071 

123 

0.586 

0.254 

0.103 

0.057 

124 

0.558 

0.228 

0.079 

0.036 

125 

0.535 

0.212 

0.068 

0.026 

126 

0.339 

0.117 

0.023 

-0.003 

28 

29 

30 

31 

28 

1.000 

29 

1.000 

1.000 

30 

1.000 

1.000 

1.000 

31 

1.000 

1.000 

1.000 

1.000 

32 

0.997 

0.996 

0.997 

0.998 

33 

0.998 

0.999 

0.998 

0.998 

34 

0.994 

0.993 

0.995 

0.996 

35 

0.994 

0.995 

0.996 

0.996 

36 

0.991 

0.993 

0.993 

0.993 

37 

0.991 

0.992 

0.993 

0.993 

38 

0.992 

0.992 

0.994 

0.994 

39 

0.991 

0.991 

0.993 

0.994 

40 

0.993 

0.993 

0.995 

0.995 

41 

0.993 

0.994 

0.995 

0.995 

42 

0.993 

0.993 

0.995 

0.995 

43 

0.992 

0.993 

0.994 

0.994 

44 

0.991 

0.991 

0.993 

0.993 

45 

0.988 

0.988 

0.991 

0.991 

46 

0.983 

0.983 

0.986 

0.987 

47 

0.977 

0.979 

0.980 

0.980 

48 

0.960 

0.962 

0.964 

0.964 

49 

0.947 

0.949 

0.951 

0.951 

50 

0.942 

0.944 

0.946 

0.947 

51 

0.936 

0.937 

0.940 

0.941 

52 

0.934 

0.936 

0.939 

0.939 

53 

0.936 

0.938 

0.941 

0.941 

54 

0.939 

0.941 

0.944 

0.944 

55 

0.940 

0.941 

0.945 

0.945 

56 

0.939 

0.941 

0.944 

0.945 

57 

0.937 

0.939 

0.942 

0.942 

58 

0.930 

0.932 

0.935 

0.936 

59 

0.919 

0.920 

0.924 

0.925 

60 

0.901 

0.902 

0.907 

0.907 

61 

0.876 

0.877 

0.882 

0.883 

62 

0.855 

0.855 

0.862 

0.863 

63 

0.080 

0.079 

0.082 

0.082 

64 

0.114 

0.113 

0.117 

0.118 

65 

0.191 

0.191 

0.197 

0.198 

66 

0.244 

0.245 

0.252 

0.253 

67 

0.262 

0.264 

0.270 

0.271 

68 

0.286 

0.288 

0.295 

0.295 

69 

0.320 

0.321 

0.328 

0.329 

70 

0.352 

0.354 

0.361 

0.361 

71 

0.389 

0.391 

0.397 

0.398 

72 

0.424 

0.426 

0.433 

0.433 

73 

0.456 

0.458 

0.465 

0.465 

74 

0.484 

0.486 

0.492 

0.493 

75 

0.507 

0.509 

0.516 

0.516 

76 

0.526 

0.529 

0.535 

0.536 

77 

0.542 

0.544 

0.551 

0.552 

78 

0.554 

0.556 

0.562 

0.563 

79 

0.565 

0.567 

0.573 

0.574 

80 

0.572 

0.575 

0.581 

0.581 

81 

0.577 

0.579 

0.585 

0.586 

82 

0.577 

0.579 

0.585 

0.586 

83 

0.576 

0.578 

0.584 

0.585 

0.195 

0.196 

0.198 

0.203 

0.201 

0.185 

0.186 

0.188 

0.193 

0.191 

0.173 

0.174 

0.176 

0.181 

0.180 

0.150 

0.150 

0.153 

0.158 

0.156 

0.130 

0.130 

0.133 

0.137 

0.136 

0.103 

0.104 

0.106 

0.110 

0.109 

0.086 

0.086 

0.088 

0.092 

0.091 

0.057 

0.057 

0.060 

0.064 

0.062 

0.044 

0.044 

0.046 

0.050 

0.048 

0.021 

0.022 

0.024 

0.027 

0.026 

0.012 

0.012 

0.014 

0.017 

0.015 

-0.016 

-0.014 

-0.012 

-0.011 

-0.012 

32 

33 

34 

35 

36 

1.000 

0.994 

0.999 

0.994 

0.989 

1.000 

0.993 

0.998 

0.997 

1.000 

0.995 

0.990 

1.000 

0.998 

1.000 

0.990 

0.997 

0.991 

0.999 

1.000 

0.994 

0.996 

0.996 

0.999 

0.998 

0.996 

0.994 

0.998 

0.998 

0.995 

0.995 

0.996 

0.997 

0.998 

0.996 

0.994 

0.997 

0.996 

0.998 

0.997 

0.995 

0.996 

0.996 

0.998 

0.996 

0.993 

0.996 

0.994 

0.997 

0.996 

0.994 

0.994 

0.995 

0.997 

0.995 

0.993 

0.992 

0.996 

0.995 

0.992 

0.991 

0.987 

0.995 

0.993 

0.989 

0.979 

0.986 

0.984 

0.993 

0.995 

0.963 

0.973 

0.970 

0.983 

0.986 

0.950 

0.961 

0.958 

0.974 

0.978 

0.948 

0.956 

0.957 

0.970 

0.974 

0.945 

0.949 

0.955 

0.965 

0.967 

0.942 

0.948 

0.953 

0.964 

0.967 

0.944 

0.951 

0.954 

0.966 

0.969 

0.948 

0.953 

0.958 

0.968 

0.970 

0.950 

0.953 

0.959 

0.968 

0.969 

0.949 

0.953 

0.959 

0.968 

0.969 

0.945 

0.951 

0.955 

0.966 

0.968 

0.940 
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